2 TEAS Working Group A. Farrel, Ed. 3 Internet-Draft Old Dog Consulting 4 Obsoletes: 3272 (if approved) July 13, 2020 5 Intended status: Informational 6 Expires: January 14, 2021 8 Overview and Principles of Internet Traffic Engineering 9 draft-ietf-teas-rfc3272bis-01 11 Abstract 13 This document describes the principles of Traffic Engineering (TE) in 14 the Internet. The document is intended to promote better 15 understanding of the issues surrounding traffic engineering in IP 16 networks and the networks that support IP networking, and to provide 17 a common basis for the development of traffic engineering 18 capabilities for the Internet. The principles, architectures, and 19 methodologies for performance evaluation and performance optimization 20 of operational networks are discussed throughout this document. 22 This work was first published as RFC 3272 in May 2002. This document 23 obsoletes RFC 3272 by making a complete update to bring the text in 24 line with best current practices for Internet traffic engineering and 25 to include references to the latest relevant work in the IETF. 27 Status of This Memo 29 This Internet-Draft is submitted in full conformance with the 30 provisions of BCP 78 and BCP 79. 32 Internet-Drafts are working documents of the Internet Engineering 33 Task Force (IETF). Note that other groups may also distribute 34 working documents as Internet-Drafts. The list of current Internet- 35 Drafts is at https://datatracker.ietf.org/drafts/current/. 37 Internet-Drafts are draft documents valid for a maximum of six months 38 and may be updated, replaced, or obsoleted by other documents at any 39 time. It is inappropriate to use Internet-Drafts as reference 40 material or to cite them other than as "work in progress." 42 This Internet-Draft will expire on January 14, 2021.
44 Copyright Notice 46 Copyright (c) 2020 IETF Trust and the persons identified as the 47 document authors. All rights reserved. 49 This document is subject to BCP 78 and the IETF Trust's Legal 50 Provisions Relating to IETF Documents 51 (https://trustee.ietf.org/license-info) in effect on the date of 52 publication of this document. Please review these documents 53 carefully, as they describe your rights and restrictions with respect 54 to this document. Code Components extracted from this document must 55 include Simplified BSD License text as described in Section 4.e of 56 the Trust Legal Provisions and are provided without warranty as 57 described in the Simplified BSD License. 59 Table of Contents 61 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 4 62 1.1. What is Internet Traffic Engineering? . . . . . . . . . . 4 63 1.2. Components of Traffic Engineering . . . . . . . . . . . . 7 64 1.3. Scope . . . . . . . . . . . . . . . . . . . . . . . . . . 8 65 1.4. Terminology . . . . . . . . . . . . . . . . . . . . . . . 9 66 2. Background . . . . . . . . . . . . . . . . . . . . . . . . . 12 67 2.1. Context of Internet Traffic Engineering . . . . . . . . . 12 68 2.2. Network Context . . . . . . . . . . . . . . . . . . . . . 13 69 2.3. Problem Context . . . . . . . . . . . . . . . . . . . . . 15 70 2.3.1. Congestion and its Ramifications . . . . . . . . . . 16 71 2.4. Solution Context . . . . . . . . . . . . . . . . . . . . 17 72 2.4.1. Combating the Congestion Problem . . . . . . . . . . 19 73 2.5. Implementation and Operational Context . . . . . . . . . 22 74 3. Traffic Engineering Process Models . . . . . . . . . . . . . 23 75 3.1. Components of the Traffic Engineering Process Model . . . 23 76 4. Review of TE Techniques . . . . . . . . . . . . . . . . . . . 24 77 4.1. Overview of IETF Projects Related to Traffic Engineering 24 78 4.1.1. Constraint-Based Routing . . . . . . . . . . . . . . 24 79 4.1.2. Integrated Services . . . . . . . . . . . . . . . . . 25 80 4.1.3. RSVP . . . . . . . . . . . . . . . . . . . . . . . . 26 81 4.1.4. Differentiated Services . . . . . . . . . . . . . . . 27 82 4.1.5. MPLS . . . . . . . . . . . . . . . . . . . . . . . . 28 83 4.1.6. Generalized MPLS . . . . . . . . . . . . . . . . . . 28 84 4.1.7. IP Performance Metrics . . . . . . . . . . . . . . . 29 85 4.1.8. Flow Measurement . . . . . . . . . . . . . . . . . . 29 86 4.1.9. Endpoint Congestion Management . . . . . . . . . . . 30 87 4.1.10. TE Extensions to the IGPs . . . . . . . . . . . . . . 30 88 4.1.11. Link-State BGP . . . . . . . . . . . . . . . . . . . 30 89 4.1.12. Path Computation Element . . . . . . . . . . . . . . 31 90 4.1.13. Application-Layer Traffic Optimization . . . . . . . 31 91 4.1.14. Segment Routing with MPLS encapsuation (SR-MPLS) . . 32 92 4.1.15. Network Virtualization and Abstraction . . . . . . . 33 93 4.1.16. Deterministic Networking . . . . . . . . . . . . . . 34 94 4.1.17. Network TE State Definition and Presentation . . . . 34 95 4.1.18. System Management and Control Interfaces . . . . . . 34 96 4.2. Content Distribution . . . . . . . . . . . . . . . . . . 34 98 5. Taxonomy of Traffic Engineering Systems . . . . . . . . . . . 35 99 5.1. Time-Dependent Versus State-Dependent Versus Event 100 Dependent . . . . . . . . . . . . . . . . . . . . . . . . 36 101 5.2. Offline Versus Online . . . . . . . . . . . . . . . . . . 37 102 5.3. Centralized Versus Distributed . . . . . . . . . . . . . 37 103 5.3.1. Hybrid Systems . . . . . . . . . . . . . . . . . . . 
38 104 5.3.2. Considerations for Software Defined Networking . . . 38 105 5.4. Local Versus Global . . . . . . . . . . . . . . . . . . . 38 106 5.5. Prescriptive Versus Descriptive . . . . . . . . . . . . . 38 107 5.5.1. Intent-Based Networking . . . . . . . . . . . . . . . 39 108 5.6. Open-Loop Versus Closed-Loop . . . . . . . . . . . . . . 39 109 5.7. Tactical vs Strategic . . . . . . . . . . . . . . . . . . 39 110 6. Recommendations for Internet Traffic Engineering . . . . . . 39 111 6.1. Generic Non-functional Recommendations . . . . . . . . . 40 112 6.2. Routing Recommendations . . . . . . . . . . . . . . . . . 42 113 6.3. Traffic Mapping Recommendations . . . . . . . . . . . . . 44 114 6.4. Measurement Recommendations . . . . . . . . . . . . . . . 45 115 6.5. Network Survivability . . . . . . . . . . . . . . . . . . 46 116 6.5.1. Survivability in MPLS Based Networks . . . . . . . . 48 117 6.5.2. Protection Option . . . . . . . . . . . . . . . . . . 49 118 6.6. Traffic Engineering in Diffserv Environments . . . . . . 50 119 6.7. Network Controllability . . . . . . . . . . . . . . . . . 52 120 7. Inter-Domain Considerations . . . . . . . . . . . . . . . . . 53 121 8. Overview of Contemporary TE Practices in Operational IP 122 Networks . . . . . . . . . . . . . . . . . . . . . . . . . . 55 123 9. Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . 59 124 10. Security Considerations . . . . . . . . . . . . . . . . . . . 59 125 11. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 59 126 12. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 59 127 13. Contributors . . . . . . . . . . . . . . . . . . . . . . . . 61 128 14. Informative References . . . . . . . . . . . . . . . . . . . 62 129 Appendix A. Historic Overview . . . . . . . . . . . . . . . . . 71 130 A.1. Traffic Engineering in Classical Telephone Networks . . . 71 131 A.2. Evolution of Traffic Engineering in Packet Networks . . . 72 132 A.2.1. Adaptive Routing in the ARPANET . . . . . . . . . . . 73 133 A.2.2. Dynamic Routing in the Internet . . . . . . . . . . . 73 134 A.2.3. ToS Routing . . . . . . . . . . . . . . . . . . . . . 74 135 A.2.4. Equal Cost Multi-Path . . . . . . . . . . . . . . . . 74 136 A.2.5. Nimrod . . . . . . . . . . . . . . . . . . . . . . . 75 137 A.3. Development of Internet Traffic Engineering . . . . . . . 75 138 A.3.1. Overlay Model . . . . . . . . . . . . . . . . . . . . 75 139 Appendix B. Overview of Traffic Engineering Related Work in 140 Other SDOs . . . . . . . . . . . . . . . . . . . . . 76 141 B.1. Overview of ITU Activities Related to Traffic Engineering 76 142 Appendix C. Summary of Changes Since RFC 3272 . . . . . . . . . 77 143 Author's Address . . . . . . . . . . . . . . . . . . . . . . . . 77 145 1. Introduction 147 This memo describes the principles of Internet traffic engineering. 148 The objective of the document is to articulate the general issues and 149 principles for Internet traffic engineering; and where appropriate to 150 provide recommendations, guidelines, and options for the development 151 of online and offline Internet traffic engineering capabilities and 152 support systems. 154 This document can aid service providers in devising and implementing 155 traffic engineering solutions for their networks. Networking 156 hardware and software vendors will also find this document helpful in 157 the development of mechanisms and support systems for the Internet 158 environment that support the traffic engineering function. 
160 This document provides a terminology for describing and understanding 161 common Internet traffic engineering concepts. This document also 162 provides a taxonomy of known traffic engineering styles. In this 163 context, a traffic engineering style abstracts important aspects from 164 a traffic engineering methodology. Traffic engineering styles can be 165 viewed in different ways depending upon the specific context in which 166 they are used and the specific purpose which they serve. The 167 combination of styles and views results in a natural taxonomy of 168 traffic engineering systems. 170 Even though Internet traffic engineering is most effective when 171 applied end-to-end, the focus of this document is traffic engineering 172 within a given domain (such as an autonomous system). However, 173 because a preponderance of Internet traffic tends to originate in one 174 autonomous system and terminate in another, this document provides an 175 overview of aspects pertaining to inter-domain traffic engineering. 177 This work was first published as [RFC3272] in May 2002. This 178 document obsoletes [RFC3272] by making a complete update to bring the 179 text in line with best current practices for Internet traffic 180 engineering and to include references to the latest relevant work in 181 the IETF. It is worth noting around three fifths of the RFCs 182 referenced in this document post-date the publication of RFC 3272. 183 Appendix C provides a summary of changes between RFC 3272 and this 184 document. 186 1.1. What is Internet Traffic Engineering? 188 The Internet exists in order to transfer information from source 189 nodes to destination nodes. Accordingly, one of the most significant 190 functions performed by the Internet is the routing of traffic from 191 ingress nodes to egress nodes. Therefore, one of the most 192 distinctive functions performed by Internet traffic engineering is 193 the control and optimization of the routing function, to steer 194 traffic through the network. 196 Internet traffic engineering is defined as that aspect of Internet 197 network engineering dealing with the issue of performance evaluation 198 and performance optimization of operational IP networks. Traffic 199 Engineering encompasses the application of technology and scientific 200 principles to the measurement, characterization, modeling, and 201 control of Internet traffic [RFC2702], [AWD2]. 203 Ultimately, it is the performance of the network as seen by end users 204 of network services that is truly paramount. The characteristics 205 visible to end users are the emergent properties of the network, 206 which are the characteristics of the network when viewed as a whole. 207 A central goal of the service provider, therefore, is to enhance the 208 emergent properties of the network while taking economic 209 considerations into account. This is accomplished by addressing 210 traffic oriented performance requirements, while utilizing network 211 resources economically and reliably. Traffic oriented performance 212 measures include delay, delay variation, packet loss, and throughput. 214 Internet traffic engineering responds to network events. Aspects of 215 capacity management respond at intervals ranging from days to years. 216 Routing control functions operate at intervals ranging from 217 milliseconds to days. 
Packet level processing functions operate at 218 very fine levels of temporal resolution, ranging from picoseconds to 219 milliseconds while responding to the real-time statistical behavior 220 of traffic. 222 Thus, the optimization aspects of traffic engineering can be viewed 223 from a control perspective and can be pro-active and/or reactive. In 224 the pro-active case, the traffic engineering control system takes 225 preventive action to obviate predicted unfavorable future network 226 states, for example by engineering a backup path. It may also take 227 perfective action to induce a more desirable state in the future. In 228 the reactive case, the control system responds correctively and 229 perhaps adaptively to events that have already transpired in the 230 network, such as routing after failure. 232 Another important objective of Internet traffic engineering is to 233 facilitate reliable network operations [RFC2702]. Reliable network 234 operations can be facilitated by providing mechanisms that enhance 235 network integrity and by embracing policies emphasizing network 236 survivability. This results in a minimization of the vulnerability 237 of the network to service outages arising from errors, faults, and 238 failures occurring within the infrastructure. 240 The optimization aspects of traffic engineering can be achieved 241 through capacity management and traffic management. As used in this 242 document, capacity management includes capacity planning, routing 243 control, and resource management. Network resources of particular 244 interest include link bandwidth, buffer space, and computational 245 resources. Likewise, as used in this document, traffic management 246 includes (1) nodal traffic control functions such as traffic 247 conditioning, queue management, and scheduling, and (2) other functions 248 that regulate traffic flow through the network or that arbitrate 249 access to network resources between different packets or between 250 different traffic streams. 252 One major challenge of Internet traffic engineering is the 253 realization of automated control capabilities that adapt quickly and 254 cost-effectively to significant changes in a network's state, while 255 still maintaining stability of the network. Results from performance 256 evaluation assessing the effectiveness of traffic engineering methods 257 can be used to identify existing problems, guide network 258 re-optimization, and aid in the prediction of potential future problems. 259 However, this process can also be time-consuming and may not be 260 suitable for acting on sudden, ephemeral changes in the network. 262 Performance evaluation can be achieved in many different ways. The 263 most notable techniques include analytical methods, simulation, and 264 empirical methods based on measurements. 266 In general, traffic engineering comes in two flavors: either as a 267 background process that constantly monitors traffic and optimizes the 268 use of resources to improve performance, or in the form of a 269 pre-planned traffic distribution that is considered optimal. 270 In the latter case, any deviation from the optimum distribution (e.g., 271 caused by a fiber cut) is reverted upon repair without further 272 optimization. However, this form of traffic engineering heavily 273 relies upon the notion that the planned state of the network is 274 indeed optimal.
Hence, in such a mode there are two levels of 275 traffic engineering: the TE-planning task to enable an optimum 276 traffic distribution, and the routing task keeping traffic flows 277 attached to the pre-planned distribution. 279 As a general rule, traffic engineering concepts and mechanisms must 280 be sufficiently specific and well-defined to address known 281 requirements, but simultaneously flexible and extensible to 282 accommodate unforeseen future demands. 284 1.2. Components of Traffic Engineering 286 As mentioned in Section 1.1, Internet Traffic Engineering provides 287 performance optimization of operational IP networks while utilizing 288 network resources economically and reliably. Such optimization is 289 supported at the control/controller level and within the data/ 290 forwarding plane. 292 The key elements required in any TE solution are as follows. Some TE 293 solutions rely on these elements to a lesser or greater extent. 294 Debate remains about whether a solution that does not include all of 295 these elements can truly be called Traffic Engineering. For the 296 sake of this document, we assert that all TE solutions must include 297 some aspects of all of these elements. Other solutions can be 298 classed as "partial TE" and also fall within the scope of this document. 300 1. Policy 302 2. Path steering 304 3. Resource management 306 Policy allows for the selection of next hops and paths based on 307 information beyond basic reachability. Early definitions of routing 308 policy, e.g., [RFC1102] and [RFC1104], discuss routing policy being 309 applied to restrict access to network resources at an aggregate 310 level. BGP is an example of a commonly used mechanism for applying 311 such policies (see [RFC4271] and [RFC5575]). In the Traffic 312 Engineering context, policy decisions are made within the control 313 plane or by controllers, and govern the selection of paths. Examples 314 of such policy decisions can be found in [RFC4655] and [RFC5394]. Standard TE 315 solutions may cover the mechanisms to distribute and/or enforce 316 policies, but specific policy definition is generally unspecified. 318 Path steering is the ability to forward packets using information 319 beyond the next hop. Examples of path steering include IPv4 source 320 routes [RFC0791], RSVP-TE explicit routes [RFC3209], and Segment 321 Routing [RFC8402]. Path steering for TE can be supported via control 322 plane protocols, by encoding in the data plane headers, or by any 323 combination of the two. This includes the case where control is provided via a 324 controller and some southbound (i.e., controller-to-router) control 325 protocol. 327 Resource management provides resource-aware control and, in some 328 cases, forwarding. Examples of resources are bandwidth, buffers, and 329 queues, which in turn can be managed to control loss and latency. 330 Resource reservation is the control aspect of resource management. 331 It provides for network domain-wide consensus on which network 332 (including node and link) resources are to be used by a particular 333 flow. This determination may be done at a very coarse or very fine 334 level. Note that this consensus exists at the network control or 335 controller level, not the data plane level. It may be purely 336 composed of accounting/bookkeeping. It typically includes an ability 337 to admit, reject, or reclassify a flow based on policy.
Such 338 accounting can be done based on a static understanding of resource 339 requirements, or using dynamic mechanisms to collect requirements 340 (e.g., via [RFC3209]) and resource availability (e.g., via 341 [RFC4203]), or any combination of the two. 343 Resource allocation is the data plane aspect of resource management. 344 It provides for the allocation of specific node and link resources to 345 specific flows. Example resources include buffers, policing and 346 rate-shaping mechanisms which are typically supported via queuing. 347 It also includes the matching of a flow, i.e., flow classification, 348 to a particular set of allocated resources. The method for flow 349 classification and granularity of resource management is technology 350 specific. Examples include DiffServ with dropping and remarking 351 [RFC4594], MPLS-TE [RFC3209] and GMPLS [RFC3945] based LSPs, and 352 controller-based solutions implementing [RFC8453]. This level of 353 resource control, while optional, is important in networks that wish 354 to support congestion management policies to control or regulate the 355 offered traffic to deliver different levels of service and alleviate 356 congestion problems, or those networks that wish to control latencies 357 experienced by specific traffic flows. 359 1.3. Scope 361 The scope of this document is intra-domain traffic engineering; that 362 is, traffic engineering within a given autonomous system in the 363 Internet. This document will discuss concepts pertaining to intra- 364 domain traffic control, including such issues as routing control, 365 micro and macro resource allocation, and the control coordination 366 problems that arise consequently. 368 This document describes and characterize techniques already in use or 369 in advanced development for Internet traffic engineering. The way 370 these techniques fit together will be discussed and scenarios in 371 which they are useful will be identified. 373 While this document considers various intra-domain traffic 374 engineering approaches, it focuses more on traffic engineering with 375 MPLS and GMPLS. Traffic engineering based upon manipulation of IGP 376 metrics is not addressed in detail. This topic may be addressed by 377 other working group documents. 379 Although the emphasis is on intra-domain traffic engineering, in 380 Section 7, an overview of the high level considerations pertaining to 381 inter-domain traffic engineering will be provided. Inter-domain 382 Internet traffic engineering is crucial to the performance 383 enhancement of the global Internet infrastructure. 385 Whenever possible, relevant requirements from existing IETF documents 386 and other sources will be incorporated by reference. 388 1.4. Terminology 390 This subsection provides terminology which is useful for Internet 391 traffic engineering. The definitions presented apply to this 392 document. These terms may have other meanings elsewhere. 394 Baseline analysis: A study conducted to serve as a baseline for 395 comparison to the actual behavior of the network. 397 Busy hour: A one hour period within a specified interval of time 398 (typically 24 hours) in which the traffic load in a network or 399 sub-network is greatest. 401 Bottleneck: A network element whose input traffic rate tends to be 402 greater than its output rate. 404 Congestion: A state of a network resource in which the traffic 405 incident on the resource exceeds its output capacity over an 406 interval of time. 
408 Congestion avoidance: An approach to congestion management that 409 attempts to obviate the occurrence of congestion. 411 Congestion control: An approach to congestion management that 412 attempts to remedy congestion problems that have already occurred. 414 Constraint-based routing: A class of routing protocols that take 415 specified traffic attributes, network constraints, and policy 416 constraints into account when making routing decisions. 417 Constraint-based routing is applicable to traffic aggregates as 418 well as flows. It is a generalization of QoS routing. 420 Demand side congestion management: A congestion management scheme 421 that addresses congestion problems by regulating or conditioning 422 offered load. 424 Effective bandwidth: The minimum amount of bandwidth that can be 425 assigned to a flow or traffic aggregate in order to deliver 426 'acceptable service quality' to the flow or traffic aggregate. 428 Egress traffic: Traffic exiting a network or network element. 430 Hot-spot: A network element or subsystem which is in a state of 431 congestion. 433 Ingress traffic: Traffic entering a network or network element. 435 Inter-domain traffic: Traffic that originates in one Autonomous 436 system and terminates in another. 438 Loss network: A network that does not provide adequate buffering for 439 traffic, so that traffic entering a busy resource within the 440 network will be dropped rather than queued. 442 Metric: A parameter defined in terms of standard units of 443 measurement. 445 Measurement methodology: A repeatable measurement technique used to 446 derive one or more metrics of interest. 448 Network survivability: The capability to provide a prescribed level 449 of QoS for existing services after a given number of failures 450 occur within the network. 452 Offline traffic engineering: A traffic engineering system that 453 exists outside of the network. 455 Online traffic engineering: A traffic engineering system that exists 456 within the network, typically implemented on or as adjuncts to 457 operational network elements. 459 Performance measures: Metrics that provide quantitative or 460 qualitative measures of the performance of systems or subsystems 461 of interest. 463 Performance management: A systematic approach to improving 464 effectiveness in the accomplishment of specific networking goals 465 related to performance improvement. 467 Performance metric: A performance parameter defined in terms of 468 standard units of measurement. 470 Provisioning: The process of assigning or configuring network 471 resources to meet certain requests. 473 QoS routing: Class of routing systems that selects paths to be used 474 by a flow based on the QoS requirements of the flow. 476 Service Level Agreement (SLA): A contract between a provider and a 477 customer that guarantees specific levels of performance and 478 reliability at a certain cost. 480 Service Level Objective (SLO): A key element of an SLA between a 481 provider and a customer. SLOs are agreed upon as a means of 482 measuring the performance of the Service Provider and are outlined 483 as a way of avoiding disputes between the two parties based on 484 misunderstanding. 486 Stability: An operational state in which a network does not 487 oscillate in a disruptive manner from one mode to another mode. 489 Supply-side congestion management: A congestion management scheme 490 that provisions additional network resources to address existing 491 and/or anticipated congestion problems. 
493 Transit traffic: Traffic whose origin and destination are both 494 outside of the network under consideration. 496 Traffic characteristic: A description of the temporal behavior or a 497 description of the attributes of a given traffic flow or traffic 498 aggregate. 500 Traffic engineering system: A collection of objects, mechanisms, and 501 protocols that are used conjunctively to accomplish traffic 502 engineering objectives. 504 Traffic flow: A stream of packets between two end-points that can be 505 characterized in a certain way. A micro-flow has a more specific 506 definition: a micro-flow is a stream of packets with the same 507 source and destination addresses, source and destination ports, 508 and protocol ID. 510 Traffic intensity: A measure of traffic loading with respect to a 511 resource capacity over a specified period of time. In classical 512 telephony systems, traffic intensity is measured in units of 513 Erlangs. 515 Traffic matrix: A representation of the traffic demand between a set 516 of origin and destination abstract nodes. An abstract node can 517 consist of one or more network elements. 519 Traffic monitoring: The process of observing traffic characteristics 520 at a given point in a network and collecting the traffic 521 information for analysis and further action. 523 Traffic trunk: An aggregation of traffic flows belonging to the same 524 class which are forwarded through a common path. A traffic trunk 525 may be characterized by an ingress and egress node, and a set of 526 attributes which determine its behavioral characteristics and 527 requirements from the network. 529 2. Background 531 The Internet must convey IP packets from ingress nodes to egress 532 nodes efficiently, expeditiously, and economically. Furthermore, in 533 a multiclass service environment (e.g., Diffserv-capable networks), 534 the resource sharing parameters of the network must be appropriately 535 determined and configured according to prevailing policies and 536 service models to resolve resource contention issues arising from 537 mutual interference between packets traversing the network. 538 Thus, consideration must be given to resolving competition for 539 network resources between traffic streams belonging to the same 540 service class (intra-class contention resolution) and traffic streams 541 belonging to different classes (inter-class contention resolution). 543 2.1. Context of Internet Traffic Engineering 545 The context of Internet traffic engineering pertains to the scenarios 546 where traffic engineering is used. A traffic engineering methodology 547 establishes appropriate rules to resolve traffic performance issues 548 occurring in a specific context. The context of Internet traffic 549 engineering includes: 551 1. A network context defining the universe of discourse, and in 552 particular the situations in which the traffic engineering 553 problems occur. The network context includes network structure, 554 network policies, network characteristics, network constraints, 555 network quality attributes, and network optimization criteria. 557 2. A problem context defining the general and concrete issues that 558 traffic engineering addresses. The problem context includes 559 identification, abstraction of relevant features, representation, 560 formulation, specification of the requirements on the solution 561 space, and specification of the desirable features of acceptable 562 solutions. 564 3.
A solution context suggesting how to address the issues 565 identified by the problem context. The solution context includes 566 analysis, evaluation of alternatives, prescription, and 567 resolution. 569 4. An implementation and operational context in which the solutions 570 are methodologically instantiated. The implementation and 571 operational context includes planning, organization, and 572 execution. 574 The context of Internet traffic engineering and the different problem 575 scenarios are discussed in the following subsections. 577 2.2. Network Context 579 IP networks range in size from small clusters of routers situated 580 within a given location, to thousands of interconnected routers, 581 switches, and other components distributed all over the world. 583 Conceptually, at the most basic level of abstraction, an IP network 584 can be represented as a distributed dynamical system consisting of: 586 o a set of interconnected resources which provide transport services 587 for IP traffic subject to certain constraints 589 o a demand system representing the offered load to be transported 590 through the network 592 o a response system consisting of network processes, protocols, and 593 related mechanisms which facilitate the movement of traffic 594 through the network (see also [AWD2]). 596 The network elements and resources may have specific characteristics 597 restricting the manner in which the demand is handled. Additionally, 598 network resources may be equipped with traffic control mechanisms 599 superintending the way in which the demand is serviced. Traffic 600 control mechanisms may, for example, be used to: 602 o control various packet processing activities within a given 603 resource 605 o arbitrate contention for access to the resource by different 606 packets 608 o regulate traffic behavior through the resource. 610 A configuration management and provisioning system may allow the 611 settings of the traffic control mechanisms to be manipulated by 612 external or internal entities in order to exercise control over the 613 way in which the network elements respond to internal and external 614 stimuli. 616 The details of how the network provides transport services for 617 packets are specified in the policies of the network administrators 618 and are installed through network configuration management and policy 619 based provisioning systems. Generally, the types of services 620 provided by the network also depends upon the technology and 621 characteristics of the network elements and protocols, the prevailing 622 service and utility models, and the ability of the network 623 administrators to translate policies into network configurations. 625 Contemporary Internet networks have three significant 626 characteristics: 628 o they provide real-time services 630 o they have become mission critical 632 o their operating environments are very dynamic. 634 The dynamic characteristics of IP and IP/MPLS networks can be 635 attributed in part to fluctuations in demand, to the interaction 636 between various network protocols and processes, to the rapid 637 evolution of the infrastructure which demands the constant inclusion 638 of new technologies and new network elements, and to transient and 639 persistent impairments which occur within the system. 641 Packets contend for the use of network resources as they are conveyed 642 through the network. 
A network resource is considered to be 642 congested if, for an interval of time, the arrival rate of packets 643 exceeds the output capacity of the resource. Congestion may result in 644 some of the arriving packets being delayed or even dropped. 646 Congestion increases transit delays, delay variation, and packet loss, 647 and reduces the predictability of network services. Clearly, 648 congestion is highly undesirable. 650 Combating congestion at a reasonable cost is a major objective of 651 Internet traffic engineering. 653 Efficient sharing of network resources by multiple traffic streams is 654 a basic operational premise for packet-switched networks in general 655 and for the Internet in particular. A fundamental challenge in 656 network operation, especially in a large-scale public IP network, is 657 to increase the efficiency of resource utilization while minimizing 658 the possibility of congestion. 660 The Internet will have to function in the presence of different 661 classes of traffic with different service requirements. RFC 2475 662 provides an architecture for Differentiated Services and makes this 663 requirement clear [RFC2475]. The RFC allows packets to be grouped 664 into behavior aggregates such that each aggregate has a common set of 665 behavioral characteristics or a common set of delivery requirements. 667 Delivery requirements of a specific set of packets may be specified 668 explicitly or implicitly. Two of the most important traffic delivery 669 requirements are capacity constraints and QoS constraints. 671 Capacity constraints can be expressed statistically as peak rates, 672 mean rates, burst sizes, or as some deterministic notion of effective 673 bandwidth. QoS requirements can be expressed in terms of: 675 o integrity constraints such as packet loss 677 o temporal constraints such as timing restrictions for 678 the delivery of each packet (delay) and timing restrictions for 679 the delivery of consecutive packets belonging to the same traffic 680 stream (delay variation). 682 2.3. Problem Context 684 There are several significant problems associated with operating a 685 network described by the simple model of the previous subsection. 686 This subsection analyzes the problem context in relation to traffic 687 engineering. 689 The identification, abstraction, representation, and measurement of 690 network features relevant to traffic engineering are significant 691 issues. 693 A particular challenge is to explicitly formulate the problems that 694 traffic engineering attempts to solve. For example: 696 o how to identify the requirements on the solution space 698 o how to specify the desirable features of good solutions 700 o how to actually solve the problems 702 o how to measure and characterize the effectiveness of the 703 solutions. 705 Another class of problems is how to measure and estimate relevant 706 network state parameters. Effective traffic engineering relies on a 707 good estimate of the offered traffic load as well as a view of the 708 underlying topology and associated resource constraints. A 709 network-wide view of the topology is also a must for offline planning. 711 Still another class of problems is how to characterize the state of 712 the network and how to evaluate its performance under a variety of 713 scenarios. The performance evaluation problem is twofold. One 714 aspect of this problem relates to the evaluation of the system-level 715 performance of the network.
The other aspect relates to the 717 evaluation of the resource-level performance, which restricts 718 attention to the performance analysis of individual network 719 resources. 721 In this document, we refer to the system-level characteristics of the 722 network as the "macro-states" and the resource-level characteristics 723 as the "micro-states." The system-level characteristics are also 724 known as the emergent properties of the network. Correspondingly, we 725 shall refer to the traffic engineering schemes dealing with network 726 performance optimization at the systems level as "macro-TE" and the 727 schemes that optimize at the individual resource level as "micro-TE." 728 Under certain circumstances, the system-level performance can be 729 derived from the resource-level performance using appropriate rules 730 of composition, depending upon the particular performance measures of 731 interest. 733 Another fundamental class of problems concerns how to effectively 734 optimize network performance. Performance optimization may entail 735 translating solutions to specific traffic engineering problems into 736 network configurations. Optimization may also entail some degree of 737 resource management control, routing control, and/or capacity 738 augmentation. 740 As noted previously, congestion is an undesirable phenomena in 741 operational networks. Therefore, the next subsection addresses the 742 issue of congestion and its ramifications within the problem context 743 of Internet traffic engineering. 745 2.3.1. Congestion and its Ramifications 747 Congestion is one of the most significant problems in an operational 748 IP context. A network element is said to be congested if it 749 experiences sustained overload over an interval of time. Congestion 750 almost always results in degradation of service quality to end users. 751 Congestion control schemes can include demand-side policies and 752 supply-side policies. Demand-side policies may restrict access to 753 congested resources and/or dynamically regulate the demand to 754 alleviate the overload situation. Supply-side policies may expand or 755 augment network capacity to better accommodate offered traffic. 756 Supply-side policies may also re-allocate network resources by 757 redistributing traffic over the infrastructure. Traffic 758 redistribution and resource re-allocation serve to increase the 759 'effective capacity' seen by the demand. 761 The emphasis of this document is primarily on congestion management 762 schemes falling within the scope of the network, rather than on 763 congestion management systems dependent upon sensitivity and 764 adaptivity from end-systems. That is, the aspects that are 765 considered in this document with respect to congestion management are 766 those solutions that can be provided by control entities operating on 767 the network and by the actions of network administrators and network 768 operations systems. 770 2.4. Solution Context 772 The solution context for Internet traffic engineering involves 773 analysis, evaluation of alternatives, and choice between alternative 774 courses of action. Generally the solution context is based on making 775 reasonable inferences about the current or future state of the 776 network, and subsequently making appropriate decisions that may 777 involve a preference between alternative sets of action. 
More 777 specifically, the solution context demands reasonable estimates of 778 traffic workload, characterization of network state, deriving 779 solutions to traffic engineering problems which may be implicitly or 780 explicitly formulated, and possibly instantiating a set of control 781 actions. Control actions may involve the manipulation of parameters 782 associated with routing, control over tactical capacity acquisition, 783 and control over the traffic management functions. 785 The following list of instruments may be applicable to the solution 786 context of Internet traffic engineering. 788 o A set of policies, objectives, and requirements (which may be 789 context dependent) for network performance evaluation and 790 performance optimization. 792 o A collection of online and possibly offline tools and mechanisms 793 for measurement, characterization, modeling, and control of 794 Internet traffic and control over the placement and allocation of 795 network resources, as well as control over the mapping or 796 distribution of traffic onto the infrastructure. 798 o A set of constraints on the operating environment, the network 799 protocols, and the traffic engineering system itself. 801 o A set of quantitative and qualitative techniques and methodologies 802 for abstracting, formulating, and solving traffic engineering 803 problems. 805 o A set of administrative control parameters which may be 806 manipulated through a Configuration Management (CM) system. The 807 CM system itself may include a configuration control subsystem, a 808 configuration repository, a configuration accounting subsystem, 809 and a configuration auditing subsystem. 811 o A set of guidelines for network performance evaluation, 812 performance optimization, and performance improvement. 814 Determining traffic characteristics through measurement and/or 815 estimation is very useful within the realm of the traffic engineering 816 solution space. Traffic estimates can be derived from customer 817 subscription information, traffic projections, traffic models, and 818 from actual measurements. The measurements may be performed at 819 different levels, e.g., at the traffic-aggregate level or at the flow 820 level. Measuring at different levels is done in order to acquire 821 traffic statistics at different levels of detail. Measurements at the flow 822 level or on small traffic aggregates may be performed at edge nodes, 823 when traffic enters and leaves the network. Measurements for large 824 traffic-aggregates may be performed within the core of the network. 826 To conduct performance studies and to support planning of existing 827 and future networks, a routing analysis may be performed to determine 828 the paths the routing protocols will choose for various traffic 829 demands, and to ascertain the utilization of network resources as 830 traffic is routed through the network (a simplified, illustrative sketch of such an analysis is given after the list below). The routing analysis should 831 capture the selection of paths through the network, the assignment of 832 traffic across multiple feasible routes, and the multiplexing of IP 833 traffic over traffic trunks (if such constructs exist) and over the 834 underlying network infrastructure. A network topology model is a 835 necessity for routing analysis. A network topology model may be 836 extracted from: 838 o network architecture documents 840 o network designs 842 o information contained in router configuration files 844 o routing databases 846 o routing tables 848 o automated tools that discover and depict network topology 849 information.
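The following is a minimal, illustrative sketch of such a routing analysis. It is not part of any specification or tool: the topology model, traffic matrix, and names (TOPOLOGY, DEMANDS, shortest_path, link_utilization) are hypothetical, links are assumed to be bidirectional with a single IGP metric and capacity, and each demand is assumed to follow one lowest-metric path (no ECMP).

   # Illustrative sketch only: route a traffic matrix over a simple
   # topology model and report per-link utilization.
   import heapq
   from collections import defaultdict

   # Topology model: (node_a, node_b) -> (igp_metric, capacity_mbps)
   TOPOLOGY = {
       ("A", "B"): (10, 1000),
       ("B", "C"): (10, 1000),
       ("A", "C"): (30, 1000),
   }

   # Traffic matrix: (ingress, egress) -> offered load in Mbps
   DEMANDS = {("A", "C"): 400, ("A", "B"): 200}

   def adjacency(topology):
       adj = defaultdict(list)
       for (a, b), (metric, _cap) in topology.items():
           adj[a].append((b, metric))
           adj[b].append((a, metric))  # links assumed bidirectional
       return adj

   def shortest_path(adj, src, dst):
       """Dijkstra's algorithm returning one lowest-metric path."""
       queue = [(0, src, [src])]
       seen = set()
       while queue:
           cost, node, path = heapq.heappop(queue)
           if node == dst:
               return path
           if node in seen:
               continue
           seen.add(node)
           for nxt, metric in adj[node]:
               if nxt not in seen:
                   heapq.heappush(queue, (cost + metric, nxt, path + [nxt]))
       return None

   def link_utilization(topology, demands):
       adj = adjacency(topology)
       load = defaultdict(float)
       for (src, dst), mbps in demands.items():
           path = shortest_path(adj, src, dst)
           if path is None:
               continue  # unreachable demand; ignored in this sketch
           for a, b in zip(path, path[1:]):
               key = (a, b) if (a, b) in topology else (b, a)
               load[key] += mbps
       # Report utilization as a fraction of each link's capacity.
       return {link: load[link] / topology[link][1] for link in load}

   for link, util in link_utilization(TOPOLOGY, DEMANDS).items():
       print(f"{link[0]}-{link[1]}: {util:.0%} utilized")

A real routing analysis would also have to capture the assignment of traffic across multiple feasible routes and the multiplexing of IP traffic over traffic trunks and the underlying infrastructure, as described above.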
852 Topology information may also be derived from servers that monitor 853 network state, and from servers that perform provisioning functions. 855 Routing in operational IP networks can be administratively controlled 856 at various levels of abstraction including the manipulation of BGP 857 attributes and IGP metrics. For path-oriented technologies such as 858 MPLS, routing can be further controlled by the manipulation of 859 relevant traffic engineering parameters, resource parameters, and 860 administrative policy constraints. Within the context of MPLS, the 861 path of an explicitly routed label switched path (LSP) can be 862 computed and established in various ways including: 864 o manually 866 o automatically online using constraint-based routing processes 867 implemented on label switching routers 869 o automatically offline using constraint-based routing entities 870 implemented on external traffic engineering support systems. 872 2.4.1. Combating the Congestion Problem 874 Minimizing congestion is a significant aspect of Internet traffic 875 engineering. This subsection gives an overview of the general 876 approaches that have been used or proposed to combat congestion 877 problems. 879 Congestion management policies can be categorized based upon the 880 following criteria (see, e.g., [YARE95] for a more detailed taxonomy 881 of congestion control schemes): 883 o Response time scale, which can be characterized as long, medium, or 884 short 886 o reactive versus preventive, which relates to congestion control and 887 congestion avoidance 889 o supply-side versus demand-side congestion management schemes. 891 These aspects are discussed in the following paragraphs. 893 1. Congestion Management based on Response Time Scales 895 * Long (weeks to months): Expanding network capacity by adding 896 new equipment, routers, and links takes time and is 897 comparatively costly. Capacity planning needs to take this 898 into consideration. Network capacity is expanded based on 899 estimates or forecasts of future traffic development and 900 traffic distribution. These upgrades are typically carried 901 out over weeks or months, or maybe even years. 903 * Medium (minutes to days): Several control policies fall within 904 the medium timescale category. Examples include: 906 a. Adjusting IGP and/or BGP parameters to route traffic away 907 from or towards certain segments of the network 909 b. Setting up and/or adjusting some explicitly routed LSPs in 910 MPLS networks to route some traffic trunks away from 911 possibly congested resources or toward possibly more 912 favorable routes 914 c. Re-configuring the logical topology of the network to make 915 it correlate more closely with the spatial traffic 916 distribution using, for example, some underlying 917 path-oriented technology such as MPLS LSPs or optical channel 918 trails. 920 Many of these adaptive medium time scale response schemes rely 921 on a measurement system. The measurement system monitors 922 changes in traffic distribution, traffic shifts, and network 923 resource utilization. The measurement system then provides 924 feedback to the online and/or offline traffic engineering 925 mechanisms and tools, which employ this feedback information to 926 trigger certain control actions to occur within the network. 927 The traffic engineering mechanisms and tools can be 928 implemented in a distributed or centralized fashion, and may 929 have a hierarchical or flat structure.
The comparative merits 930 of distributed and centralized control structures for networks 931 are well known. A centralized scheme may have global 932 visibility into the network state and may produce solutions that 933 are closer to optimal. However, centralized schemes are 934 prone to single points of failure and may not scale as well as 935 distributed schemes. Moreover, the information utilized by a 936 centralized scheme may be stale and may not reflect the actual 937 state of the network. It is not an objective of this memo to 938 make a recommendation between distributed and centralized 939 schemes. This is a choice that network administrators must 940 make based on their specific needs. 942 * Short (picoseconds to minutes): This category includes 943 packet-level processing functions and events on the order of several 944 round trip times. It includes router mechanisms such as 945 passive and active buffer management. These mechanisms are 946 used to control congestion and/or signal congestion to end 947 systems so that they can adaptively regulate the rate at which 948 traffic is injected into the network. One of the most popular 949 active queue management schemes, especially for TCP traffic, 950 is Random Early Detection (RED) [FLJA93]. RED supports 951 congestion avoidance by controlling the average queue size. 952 During congestion (but before the queue is filled), the RED 953 scheme chooses arriving packets to "mark" according to a 954 probabilistic algorithm which takes into account the average 955 queue size. For a router that does not utilize Explicit 956 Congestion Notification (ECN) (see, e.g., [FLOY94]), the marked 957 packets can simply be dropped to signal the inception of 958 congestion to end systems. On the other hand, if the router 959 supports ECN, then it can set the ECN field in the packet 960 header. Several variations of RED have been proposed to 961 support different drop precedence levels in multi-class 962 environments [RFC2597], e.g., RED with In and Out (RIO) and 963 Weighted RED. There is general consensus that RED provides 964 congestion avoidance performance which is not worse than 965 traditional Tail-Drop (TD) queue management (drop arriving 966 packets only when the queue is full). Importantly, however, 967 RED reduces the possibility of global synchronization and 968 improves fairness among different TCP sessions. However, RED 969 by itself cannot prevent congestion and unfairness caused by 970 sources unresponsive to RED, e.g., UDP traffic and some 971 misbehaved greedy connections. Other schemes have been 972 proposed to improve the performance and fairness in the 973 presence of unresponsive traffic. Some of these schemes were 974 proposed as theoretical frameworks and are typically not 975 available in existing commercial products. Two such schemes 976 are Longest Queue Drop (LQD) and Dynamic Soft Partitioning 977 with Random Drop (RND) [SLDC98]. 979 2. Congestion Management: Reactive versus Preventive Schemes 981 * Reactive: Reactive (recovery) congestion management policies 982 react to existing congestion problems in order to alleviate them. All the 983 policies described in the long and medium time scales above 984 can be categorized as being reactive, especially if the 985 policies are based on monitoring and identifying existing 986 congestion problems, and on the initiation of relevant actions 987 to ease the situation.
989 * Preventive: Preventive (predictive/avoidance) policies take 990 proactive action to prevent congestion based on estimates and 991 predictions of future potential congestion problems. Some of 992 the policies described in the long and medium time scales fall 993 into this category. They do not necessarily respond 994 immediately to existing congestion problems. Instead, 995 forecasts of traffic demand and workload distribution are 996 considered and action may be taken to prevent potential 997 congestion problems in the future. The schemes described in 998 the short time scale (e.g., RED and its variations, ECN, LQD, 999 and RND) are also used for congestion avoidance since dropping 1000 or marking packets before queues actually overflow would 1001 trigger corresponding TCP sources to slow down. 1003 3. Congestion Management: Supply-Side versus Demand-Side Schemes 1004 * Supply-side: Supply-side congestion management policies 1005 increase the effective capacity available to traffic in order 1006 to control or reduce congestion. This can be accomplished by 1007 increasing capacity. Another way to accomplish this is to 1008 minimize congestion by having a relatively balanced 1009 distribution of traffic over the network. For example, 1010 capacity planning should aim to provide a physical topology 1011 and associated link bandwidths that match estimated traffic 1012 workload and traffic distribution. This may be based on 1013 forecasting and subject to budgetary or other constraints. If 1014 the actual traffic distribution does not match the topology 1015 derived from capacity planning, then the traffic can be mapped 1016 onto the existing topology using routing control mechanisms, 1017 using path-oriented technologies (e.g., MPLS LSPs and optical 1018 channel trails) to modify the logical topology, or by using 1019 some other load redistribution mechanisms. 1021 * Demand-side: Demand-side congestion management policies 1022 control or regulate the offered traffic to alleviate 1023 congestion problems. For example, some of the short time 1024 scale mechanisms described earlier (such as RED and its 1025 variations, ECN, LQD, and RND) as well as policing and 1026 rate-shaping mechanisms attempt to regulate the offered load in 1027 various ways. Tariffs may also be applied as a demand-side 1028 instrument. To date, however, tariffs have not been used as a 1029 means of demand-side congestion management within the 1030 Internet. 1032 In summary, a variety of mechanisms can be used to address congestion 1033 problems in IP networks. These mechanisms may operate at multiple 1034 time-scales and at multiple traffic aggregation levels. 1036 2.5. Implementation and Operational Context 1038 The operational context of Internet traffic engineering is 1039 characterized by constant changes which occur at multiple levels of 1040 abstraction. The implementation context demands effective planning, 1041 organization, and execution. The planning aspects may involve 1042 determining prior sets of actions to achieve desired objectives. 1043 Organizing involves arranging and assigning responsibility to the 1044 various components of the traffic engineering system and coordinating 1045 the activities to accomplish the desired TE objectives. Execution 1046 involves measuring and applying corrective or perfective actions to 1047 attain and maintain desired TE goals. 1049 3.
Traffic Engineering Process Models 1051 This section describes a generic process model that captures the 1052 high-level practical aspects of Internet traffic engineering in an 1053 operational context. The process model is described as a sequence of 1054 actions that a traffic engineer, or more generally a traffic 1055 engineering system, must perform to optimize the performance of an 1056 operational network (see also [RFC2702] and [AWD2]). This process model 1057 may be enacted explicitly or implicitly, by an automaton and/or by a 1058 human. 1060 The traffic engineering process model is iterative [AWD2]. The four 1061 phases of the process model described below are repeated continually. 1063 o Define the relevant control policies that govern the operation of 1064 the network. 1066 o Acquire measurement data from the operational network through a 1067 feedback mechanism. 1069 o Analyze the network state and characterize the traffic workload. 1070 Performance analysis may be proactive and/or reactive. Proactive 1071 performance analysis identifies potential problems that do not 1072 yet exist but could manifest in the future. Reactive performance 1073 analysis identifies existing problems, determines their cause 1074 through diagnosis, and evaluates alternative approaches to remedy 1075 the problem, if necessary. 1077 o Optimize the performance of the network. This involves a decision 1078 process which selects and implements a set of actions from a set 1079 of alternatives. Optimization actions may include the use of 1080 appropriate techniques to either control the offered traffic or to 1081 control the distribution of traffic across the network. 1083 3.1. Components of the Traffic Engineering Process Model 1085 The key components of the traffic engineering process model are: 1087 1. Measurement is crucial to the traffic engineering function. The 1088 operational state of a network can be conclusively determined 1089 only through measurement. Measurement is also critical to the 1090 optimization function because it provides feedback data which is 1091 used by traffic engineering control subsystems. This data is 1092 used to adaptively optimize network performance in response to 1093 events and stimuli originating within and outside the network. 1094 Measurement in support of the TE function can occur at different 1095 levels of abstraction. For example, measurement can be used to 1096 derive packet-level characteristics, flow-level characteristics, 1097 user or customer-level characteristics, traffic aggregate 1098 characteristics, component-level characteristics, and 1099 network-wide characteristics. 1101 2. Modeling, analysis, and simulation are important aspects of 1102 Internet traffic engineering. Modeling involves constructing an 1103 abstract or physical representation which depicts relevant 1104 traffic characteristics and network attributes. A network model 1105 is an abstract representation of the network which captures 1106 relevant network features, attributes, and characteristics. 1107 Network simulation tools are extremely useful for traffic 1108 engineering. Because of the complexity of realistic quantitative 1109 analysis of network behavior, certain aspects of network 1110 performance studies can only be conducted effectively using 1111 simulation. 1113 3. Network performance optimization involves resolving network 1114 issues by transforming such issues into concepts that enable a 1115 solution, identification of a solution, and implementation of the 1116 solution.
Network performance optimization can be corrective or 1117 perfective. In corrective optimization, the goal is to remedy a 1118 problem that has occurred or that is incipient. In perfective 1119 optimization, the goal is to improve network performance even 1120 when explicit problems do not exist and are not anticipated. 1122 4. Review of TE Techniques 1124 This section briefly reviews different traffic engineering approaches 1125 proposed and implemented in telecommunications and computer networks. 1126 The discussion is not intended to be comprehensive. It is primarily 1127 intended to illuminate pre-existing perspectives and prior art 1128 concerning traffic engineering in the Internet and in legacy 1129 telecommunications networks. A historic overview is provided in 1130 Appendix A. 1132 4.1. Overview of IETF Projects Related to Traffic Engineering 1134 This subsection reviews a number of IETF activities pertinent to 1135 Internet traffic engineering. These activities are primarily 1136 intended to evolve the IP architecture to support new service 1137 definitions which allow preferential or differentiated treatment to 1138 be accorded to certain types of traffic. 1140 4.1.1. Constraint-Based Routing 1142 Constraint-based routing refers to a class of routing systems that 1143 compute routes through a network subject to the satisfaction of a set 1144 of constraints and requirements. In the most general setting, 1145 constraint-based routing may also seek to optimize overall network 1146 performance while minimizing costs. 1148 The constraints and requirements may be imposed by the network itself 1149 or by administrative policies. Constraints may include bandwidth, 1150 hop count, delay, and policy instruments such as resource class 1151 attributes. Constraints may also include domain specific attributes 1152 of certain network technologies and contexts which impose 1153 restrictions on the solution space of the routing function. Path 1154 oriented technologies such as MPLS have made constraint-based routing 1155 feasible and attractive in public IP networks. 1157 The concept of constraint-based routing within the context of MPLS 1158 traffic engineering requirements in IP networks was first described 1159 in [RFC2702] and led to developments such as MPLS-TE [RFC3209] as 1160 described in Section 4.1.5. 1162 Unlike QoS routing (for example, see [RFC2386] and [MA]) which 1163 generally addresses the issue of routing individual traffic flows to 1164 satisfy prescribed flow based QoS requirements subject to network 1165 resource availability, constraint-based routing is applicable to 1166 traffic aggregates as well as flows and may be subject to a wide 1167 variety of constraints which may include policy restrictions. 1169 4.1.2. Integrated Services 1171 The IETF Integrated Services working group developed the integrated 1172 services (Intserv) model. This model requires resources, such as 1173 bandwidth and buffers, to be reserved a priori for a given traffic 1174 flow to ensure that the quality of service requested by the traffic 1175 flow is satisfied. The integrated services model includes additional 1176 components beyond those used in the best-effort model such as packet 1177 classifiers, packet schedulers, and admission control. A packet 1178 classifier is used to identify flows that are to receive a certain 1179 level of service. A packet scheduler handles the scheduling of 1180 service to different packet flows to ensure that QoS commitments are 1181 met. 
Admission control is used to determine whether a router has the 1182 necessary resources to accept a new flow. 1184 The main issue with the Integrated Services model has been 1185 scalability [RFC2998], especially in large public IP networks which 1186 may potentially have millions of active micro-flows in transit 1187 concurrently. 1189 A notable feature of the Integrated Services model is that it 1190 requires explicit signaling of QoS requirements from end systems to 1191 routers [RFC2753]. The Resource Reservation Protocol (RSVP) performs 1192 this signaling function and is a critical component of the Integrated 1193 Services model. RSVP is described next. 1195 4.1.3. RSVP 1197 RSVP is a soft state signaling protocol [RFC2205]. It supports 1198 receiver initiated establishment of resource reservations for both 1199 multicast and unicast flows. RSVP was originally developed as a 1200 signaling protocol within the integrated services framework for 1201 applications to communicate QoS requirements to the network and for 1202 the network to reserve relevant resources to satisfy the QoS 1203 requirements [RFC2205]. 1205 Under RSVP, the sender or source node sends a PATH message to the 1206 receiver with the same source and destination addresses as the 1207 traffic which the sender will generate. The PATH message contains: 1208 (1) a sender Tspec specifying the characteristics of the traffic, (2) 1209 a sender Template specifying the format of the traffic, and (3) an 1210 optional Adspec which is used to support the concept of One Pass With 1211 Advertising (OPWA) [RFC2205]. Every intermediate router along the 1212 path forwards the PATH message to the next hop determined by the 1213 routing protocol. Upon receiving a PATH message, the receiver 1214 responds with a RESV message which includes a flow descriptor used to 1215 request resource reservations. The RESV message travels to the 1216 sender or source node in the opposite direction along the path that 1217 the PATH message traversed. Every intermediate router along the path 1218 can reject or accept the reservation request of the RESV message. If 1219 the request is rejected, the rejecting router will send an error 1220 message to the receiver and the signaling process will terminate. If 1221 the request is accepted, link bandwidth and buffer space are 1222 allocated for the flow and the related flow state information is 1223 installed in the router. 1225 One of the issues with the original RSVP specification was 1226 scalability. This is because reservations were required for micro- 1227 flows, so that the amount of state maintained by network elements 1228 tended to increase linearly with the number of micro-flows. These 1229 issues are described in [RFC2961]. 1231 RSVP has since been modified and extended in several ways to 1232 mitigate the scaling problems. As a result, it has become a 1233 versatile signaling protocol for the Internet. For example, RSVP has 1234 been extended to reserve resources for aggregation of flows, to set 1235 up MPLS explicit label switched paths, and to perform other signaling 1236 functions within the Internet. There are also a number of proposals 1237 to reduce the number of refresh messages required to maintain 1238 established RSVP sessions [RFC2961]. 1240 A number of IETF working groups have been engaged in activities 1241 related to the RSVP protocol.
These include the original RSVP 1242 working group, the MPLS working group, the CCAMP working group, the 1243 TEAS working group, the Resource Allocation Protocol working group, 1244 and the Policy Framework working group. 1246 4.1.4. Differentiated Services 1248 The goal of the Differentiated Services (Diffserv) effort within the 1249 IETF is to devise scalable mechanisms for categorization of traffic 1250 into behavior aggregates, which ultimately allows each behavior 1251 aggregate to be treated differently, especially when there is a 1252 shortage of resources such as link bandwidth and buffer space 1253 [RFC2475]. One of the primary motivations for the Diffserv effort 1254 was to devise alternative mechanisms for service differentiation in 1255 the Internet that mitigate the scalability issues encountered with 1256 the Intserv model. 1258 The IETF Diffserv working group has defined a Differentiated Services 1259 field in the IP header (DS field). The DS field consists of six bits 1260 of the part of the IP header formerly known as the TOS octet. The DS 1261 field is used to indicate the forwarding treatment that a packet 1262 should receive at a node [RFC2474]. The Diffserv working group has 1263 also standardized a number of Per-Hop Behavior (PHB) groups. Using 1264 the PHBs, several classes of services can be defined using different 1265 classification, policing, shaping, and scheduling rules. 1267 For an end-user of network services to receive Differentiated 1268 Services from its Internet Service Provider (ISP), it may be 1269 necessary for the user to have a Service Level Agreement (SLA) with 1270 the ISP. An SLA may explicitly or implicitly specify a Traffic 1271 Conditioning Agreement (TCA) which defines classifier rules as well 1272 as metering, marking, discarding, and shaping rules. 1274 Packets are classified, and possibly policed and shaped at the 1275 ingress to a Diffserv network. When a packet traverses the boundary 1276 between different Diffserv domains, the DS field of the packet may be 1277 re-marked according to existing agreements between the domains. 1279 Differentiated Services allows only a finite number of service 1280 classes to be specified by the DS field. The main advantage of the 1281 Diffserv approach relative to the Intserv model is scalability. 1282 Resources are allocated on a per-class basis and the amount of state 1283 information is proportional to the number of classes rather than to 1284 the number of application flows. 1286 It should be obvious from the previous discussion that the Diffserv 1287 model essentially deals with traffic management issues on a per hop 1288 basis. The Diffserv control model consists of a collection of micro- 1289 TE control mechanisms. Other traffic engineering capabilities, such 1290 as capacity management (including routing control), are also required 1291 in order to deliver acceptable service quality in Diffserv networks. 1292 The concept of Per Domain Behaviors has been introduced to better 1293 capture the notion of differentiated services across a complete 1294 domain [RFC3086]. 1296 4.1.5. MPLS 1298 MPLS is an advanced forwarding scheme which also includes extensions 1299 to conventional IP control plane protocols. MPLS extends the 1300 Internet routing model and enhances packet forwarding and path 1301 control [RFC3031]. 
1303 At the ingress to an MPLS domain, Label Switching Routers (LSRs) 1304 classify IP packets into Forwarding Equivalence Classes (FECs) based 1305 on a variety of factors, including, e.g., a combination of the 1306 information carried in the IP header of the packets and the local 1307 routing information maintained by the LSRs. An MPLS label stack 1308 entry is then prepended to each packet according to its forwarding 1309 equivalence class. The MPLS label stack entry is 32 bits long and 1310 contains a 20-bit label field. 1312 An LSR makes forwarding decisions by using the label prepended to 1313 packets as the index into a local next hop label forwarding entry 1314 (NHLFE). The packet is then processed as specified in the NHLFE. 1315 The incoming label may be replaced by an outgoing label (label swap), 1316 and the packet may be forwarded to the next LSR. Before a packet 1317 leaves an MPLS domain, its MPLS label may be removed (label pop). A 1318 Label Switched Path (LSP) is the path between an ingress LSR and an 1319 egress LSR that a labeled packet traverses. The path of an 1320 explicit LSP is defined at the originating (ingress) node of the LSP. 1321 MPLS can use a signaling protocol such as RSVP or LDP to set up LSPs. 1323 MPLS is a very powerful technology for Internet traffic engineering 1324 because it supports explicit LSPs which allow constraint-based 1325 routing to be implemented efficiently in IP networks [AWD2]. The 1326 requirements for traffic engineering over MPLS are described in 1327 [RFC2702]. Extensions to RSVP to support instantiation of explicit 1328 LSPs are discussed in [RFC3209]. 1330 4.1.6. Generalized MPLS 1332 GMPLS extends MPLS control protocols to encompass time-division 1333 (e.g., SONET/SDH, PDH, G.709), wavelength (lambdas), and spatial 1334 switching (e.g., incoming port or fiber to outgoing port or fiber) as 1335 well as continuing to support packet switching. GMPLS provides a 1336 common set of control protocols for all of these layers (including 1337 some technology-specific extensions), each of which has a diverse data 1338 or forwarding plane. GMPLS covers both the signaling and the routing 1339 part of that control plane and is based on the Traffic Engineering 1340 extensions to MPLS (see Section 4.1.5). 1342 In GMPLS, the original MPLS architecture is extended to include LSRs 1343 whose forwarding planes rely on circuit switching, and therefore 1344 cannot forward data based on the information carried in either packet 1345 or cell headers. Specifically, such LSRs include devices where the 1346 switching is based on time slots, wavelengths, or physical ports. 1347 These additions impact basic LSP properties: how labels are requested 1348 and communicated, the unidirectional nature of MPLS LSPs, how errors 1349 are propagated, and information provided for synchronizing the 1350 ingress and egress LSRs. 1352 4.1.7. IP Performance Metrics 1354 The IETF IP Performance Metrics (IPPM) working group has been 1355 developing a set of standard metrics that can be used to monitor the 1356 quality, performance, and reliability of Internet services. These 1357 metrics can be applied by network operators, end-users, and 1358 independent testing groups to provide users and service providers 1359 with a common understanding of the performance and reliability of the 1360 Internet component 'clouds' they use/provide [RFC2330]. The criteria 1361 for performance metrics developed by the IPPM WG are described in 1362 [RFC2330].
Examples of performance metrics include one-way packet 1363 loss [RFC7680], one-way delay [RFC7679], and connectivity measures 1364 between two nodes [RFC2678]. Other metrics include second-order 1365 measures of packet loss and delay. 1367 Some of the performance metrics specified by the IPPM WG are useful 1368 for specifying Service Level Agreements (SLAs). SLAs are sets of 1369 service level objectives negotiated between users and service 1370 providers, wherein each objective is a combination of one or more 1371 performance metrics, possibly subject to certain constraints. 1373 4.1.8. Flow Measurement 1375 The IETF Real Time Flow Measurement (RTFM) working group has produced 1376 an architecture document defining a method to specify traffic flows 1377 as well as a number of components for flow measurement (meters, meter 1378 readers, and managers) [RFC2722]. A flow measurement system enables 1379 network traffic flows to be measured and analyzed at the flow level 1380 for a variety of purposes. As noted in RFC 2722, a flow measurement 1381 system can be very useful in the following contexts: 1383 o understanding the behavior of existing networks 1384 o planning for network development and expansion 1386 o quantification of network performance 1388 o verifying the quality of network service 1390 o attribution of network usage to users. 1392 A flow measurement system consists of meters, meter readers, and 1393 managers. A meter observes packets passing through a measurement 1394 point, classifies them into certain groups, accumulates certain usage 1395 data (such as the number of packets and bytes for each group), and 1396 stores the usage data in a flow table. A group may represent a user 1397 application, a host, a network, a group of networks, etc. A meter 1398 reader gathers usage data from various meters so it can be made 1399 available for analysis. A manager is responsible for configuring and 1400 controlling meters and meter readers. The instructions received by a 1401 meter from a manager include flow specification, meter control 1402 parameters, and sampling techniques. The instructions received by a 1403 meter reader from a manager include the address of the meter whose 1404 data is to be collected, the frequency of data collection, and the 1405 types of flows to be collected. 1407 4.1.9. Endpoint Congestion Management 1409 [RFC3124] is intended to provide a set of congestion control 1410 mechanisms that transport protocols can use. It is also intended to 1411 develop mechanisms for unifying congestion control across a subset of 1412 an endpoint's active unicast connections (called a congestion group). 1413 A congestion manager continuously monitors the state of the path for 1414 each congestion group under its control. The manager uses that 1415 information to instruct a scheduler on how to partition bandwidth 1416 among the connections of that congestion group. 1418 4.1.10. TE Extensions to the IGPs 1420 TBD 1422 4.1.11. Link-State BGP 1424 In a number of environments, a component external to a network is 1425 called upon to perform computations based on the network topology and 1426 current state of the connections within the network, including 1427 traffic engineering information. This information is typically 1428 distributed by IGP routing protocols within the network (see 1429 Section 4.1.10). 1431 The Border Gateway Protocol (BGP) (see Section 7) is one of the essential 1432 routing protocols that glue the Internet together.
BGP Link State 1433 (BGP-LS) [RFC7752] is a mechanism by which link-state and traffic 1434 engineering information can be collected from networks and shared 1435 with external components using the BGP routing protocol. The 1436 mechanism is applicable to physical and virtual IGP links, and is 1437 subject to policy control. 1439 Information collected by BGP-LS can be used to construct the Traffic 1440 Engineering Database (TED, see Section 4.1.17) for use by the Path 1441 Computation Element (PCE, see Section 4.1.12), or may be used by 1442 Application-Layer Traffic Optimization (ALTO) servers (see 1443 Section 4.1.13). 1445 4.1.12. Path Computation Element 1447 Constraint-based path computation is a fundamental building block for 1448 traffic engineering in MPLS and GMPLS networks. Path computation in 1449 large, multi-domain networks is complex and may require special 1450 computational components and cooperation between the elements in 1451 different domains. The Path Computation Element (PCE) [RFC4655] is 1452 an entity (component, application, or network node) that is capable 1453 of computing a network path or route based on a network graph and 1454 applying computational constraints. 1456 Thus, a PCE can provide a central component in a traffic engineering 1457 system operating on the Traffic Engineering Database (TED, see 1458 Section 4.1.17) with delegated responsibility for determining paths 1459 in MPLS, GMPLS, or Segment Routing networks. The PCE uses the Path 1460 Computation Element Communication Protocol (PCEP) [RFC5440] to 1461 communicate with Path Computation Clients (PCCs), such as MPLS LSRs, 1462 to answer their requests for computed paths or to instruct them to 1463 initiate new paths [RFC8281] and maintain state about paths already 1464 installed in the network [RFC8231]. 1466 PCEs form key components of a number of traffic engineering systems, 1467 such as the Application of the Path Computation Element Architecture 1468 [RFC6805], the Applicability of a Stateful Path Computation Element 1469 [RFC8051], Abstraction and Control of TE Networks (ACTN) 1470 (see Section 4.1.15), Centralized Network Control [RFC8283], and Software 1471 Defined Networking (SDN) (see Section 5.3.2). 1473 4.1.13. Application-Layer Traffic Optimization 1475 TBD 1477 4.1.14. Segment Routing with MPLS encapsulation (SR-MPLS) 1479 Segment Routing (SR) leverages the source routing and tunneling 1480 paradigms: the path that a packet takes is defined at the ingress, 1481 and the packet is tunneled to the egress. 1483 A node steers a packet through a controlled set of instructions, 1484 called segments, by prepending an SR header to the packet (a label 1485 stack in the MPLS case). 1487 A segment can represent any instruction, topological or service- 1488 based, thanks to the MPLS architecture [RFC3031]. Labels can be 1489 looked up in a global context (platform wide) as well as in some 1490 other context (see "context labels" in section 3 of [RFC5331]). 1492 4.1.14.1. Base Segment Routing Identifier Types 1494 Segments are identified by Segment Identifiers (SIDs). There are 1495 four types of SID that are relevant for traffic engineering. 1497 Prefix SID: A Prefix SID uses the SR Global Block (SRGB), must be 1498 unique within the routing domain, and is advertised by an IGP. The Prefix-SID 1499 can be configured as an absolute value or an index. 1501 Node SID: A Node SID is a prefix SID with the 'N' (node) bit set; it 1502 is associated with a host prefix (/32 or /128) that identifies the 1503 node. More than one Node SID can be configured per node.
1505 Adjacency SID: An Adjacency SID is locally significant (by default). 1506 It can be made globally significant through use of the 'L' flag. 1507 It identifies a unidirectional adjacency. In most implementations 1508 Adjacency SIDs are automatically allocated for each adjacency. 1509 They are always encoded as an absolute (not indexed) value. 1511 Binding SID: A Binding SID has two purposes: 1513 1. Mapping Server in IS-IS 1515 In IS-IS, the SID/Label Binding TLV is used to advertise 1516 prefix-to-SID/Label mappings. This functionality is 1517 called the Segment Routing Mapping Server (SRMS). The 1518 behavior of the SRMS is defined in [RFC8661]. 1520 2. Cross-connect (label to FEC mapping) 1522 This is fundamental for multi-domain/multi-layer operation. 1523 The Binding SID identifies a new path (which could be an SR 1524 or hierarchical path, at another OSI layer) available at the 1525 anchor point. It is always local to the originator (it must 1526 not be at the top of the stack) and must be looked up in the 1527 context of the Node SID. It could be provisioned through 1528 NETCONF/RESTCONF, PCEP, BGP, or the CLI. 1530 4.1.15. Network Virtualization and Abstraction 1532 One of the main drivers for Software-Defined Networking (SDN) 1533 [RFC7149] is a decoupling of the network control plane from the data 1534 plane. This separation has been achieved for TE networks with the 1535 development of MPLS/GMPLS [RFC3945] and the Path Computation Element 1536 (PCE) [RFC4655]. One of the advantages of SDN is its logically 1537 centralized control regime that allows a global view of the 1538 underlying networks. Centralized control in SDN helps improve 1539 network resource utilization compared with distributed network 1540 control. 1542 Abstraction and Control of TE networks (ACTN) [RFC8453] defines a 1543 hierarchical SDN architecture which describes the functional entities 1544 and methods for the coordination of resources across multiple 1545 domains, to provide end-to-end traffic engineered services. ACTN 1546 facilitates end-to-end connections and provides them to the user. 1547 ACTN is focused on aspects such as abstraction, virtualization, and 1548 presentation. In particular, it deals with: 1550 o Abstraction of the underlying network resources and how they are 1551 provided to higher-layer applications and customers. 1553 o Virtualization of underlying resources, whose selection criterion 1554 is the allocation of those resources for the customer, 1555 application, or service. The creation of a virtualized 1556 environment allows operators to view and control multi-domain 1557 networks as a single virtualized network. 1559 o Presentation to customers of networks as a virtual network via 1560 open and programmable interfaces. 1562 The infrastructure managed by ACTN consists of traffic engineered 1563 network resources, which may include statistical packet bandwidth, 1564 physical forwarding plane resources (such as wavelengths and time 1565 slots), and forwarding and cross-connect capabilities. The ACTN type of network 1566 virtualization allows customers and applications (tenants) to 1567 utilize and independently control allocated virtual network resources 1568 as if they were physically their own resources. The 1569 ACTN network is "sliced", with tenants being given a different 1570 partial and abstracted topology view of the physical underlying 1571 network. 1573 4.1.16. Deterministic Networking 1575 TBD 1577 4.1.17.
Network TE State Definition and Presentation 1579 The network states that are relevant to traffic engineering need 1580 to be stored in the system and presented to the user. The Traffic 1581 Engineering Database (TED) is a collection of all TE information 1582 about all TE nodes and TE links in the network and is an essential 1583 component of a TE system, such as MPLS-TE [RFC2702] and GMPLS 1584 [RFC3945]. In order to formally define the data in the TED and to 1585 present the data to the user with high usability, the data modeling 1586 language YANG [RFC7950] can be used as described in 1587 [I-D.ietf-teas-yang-te-topo]. 1589 4.1.18. System Management and Control Interfaces 1591 The traffic engineering control system needs to have a management 1592 interface that is human-friendly and control interfaces that are 1593 programmable for automation. The Network Configuration Protocol 1594 (NETCONF) [RFC6241] or the RESTCONF Protocol [RFC8040] provide 1595 programmable interfaces that are also human-friendly. These 1596 protocols use XML or JSON encoded messages. When message compactness 1597 or protocol bandwidth consumption needs to be optimized for the 1598 control interface, other protocols, such as Group Communication for 1599 the Constrained Application Protocol (CoAP) [RFC7390] or gRPC, are 1600 available, especially when the protocol messages are encoded in a 1601 binary format. Along with any of these protocols, the data modeling 1602 language YANG [RFC7950] can be used to formally and precisely define 1603 the interface data. 1605 The Path Computation Element Communication Protocol (PCEP) [RFC5440] 1606 is another protocol that has evolved to be an option for the TE 1607 system control interface. The messages of PCEP are TLV-based, not 1608 defined by a data modeling language such as YANG. 1610 4.2. Content Distribution 1612 The Internet is dominated by client-server interactions, especially 1613 Web traffic (in the future, more sophisticated media servers may 1614 become dominant). The location and performance of major information 1615 servers have a significant impact on the traffic patterns within the 1616 Internet as well as on the perception of service quality by end 1617 users. 1619 A number of dynamic load balancing techniques have been devised to 1620 improve the performance of replicated information servers. These 1621 techniques can cause spatial traffic characteristics to become more 1622 dynamic in the Internet because information servers can be 1623 dynamically picked based upon the location of the clients, the 1624 location of the servers, the relative utilization of the servers, the 1625 relative performance of different networks, and the relative 1626 performance of different parts of a network. This process of 1627 assignment of distributed servers to clients is called Traffic 1628 Directing. It is an application-layer function. 1630 Traffic Directing schemes that allocate servers in multiple 1631 geographically dispersed locations to clients may require empirical 1632 network performance statistics to make more effective decisions. In 1633 the future, network measurement systems may need to provide this type 1634 of information. The exact parameters needed are not yet defined. 1636 When congestion exists in the network, Traffic Directing and Traffic 1637 Engineering systems should act in a coordinated manner. This topic 1638 is for further study.
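As a purely illustrative sketch of the Traffic Directing function described above, the following Python fragment selects a replica server for a client site by combining a measured round-trip time with the current server utilization. The server names, metric values, and the equal weighting of proximity and load are hypothetical assumptions, not part of any specification.

   # Illustrative Traffic Directing decision (hypothetical names and values).
   # Combines empirical RTT measurements with current server utilization.

   def select_server(servers, client_site, rtt_ms, utilization):
       """Return the replica with the lowest combined score."""
       def score(server):
           # Equal weighting of proximity and load is an assumption.
           return rtt_ms[(client_site, server)] * (1.0 + utilization[server])
       return min(servers, key=score)

   servers = ["replica-east", "replica-west"]
   rtt = {("site-a", "replica-east"): 40.0, ("site-a", "replica-west"): 48.0}
   load = {"replica-east": 0.9, "replica-west": 0.1}
   print(select_server(servers, "site-a", rtt, load))  # -> replica-west

In practice, the weighting between network proximity and server load would be a matter of operator policy, and the statistics would be refreshed continuously from a measurement system.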
1640 The issues related to location and replication of information 1641 servers, particularly web servers, are important for Internet traffic 1642 engineering because these servers contribute a substantial proportion 1643 of Internet traffic. 1645 5. Taxonomy of Traffic Engineering Systems 1647 This section presents a short taxonomy of traffic engineering 1648 systems. A taxonomy of traffic engineering systems can be 1649 constructed based on traffic engineering styles and views as listed 1650 below: 1652 o Time-dependent vs State-dependent vs Event-dependent 1654 o Offline vs Online 1656 o Centralized vs Distributed 1658 o Local vs Global Information 1660 o Prescriptive vs Descriptive 1662 o Open Loop vs Closed Loop 1664 o Tactical vs Strategic 1666 These classification systems are described in greater detail in the 1667 following subsections of this document. 1669 5.1. Time-Dependent Versus State-Dependent Versus Event Dependent 1671 Traffic engineering methodologies can be classified as time- 1672 dependent, state-dependent, or event-dependent. All TE schemes 1673 are considered to be dynamic in this document. Static TE implies 1674 that no traffic engineering methodology or algorithm is being 1675 applied. 1677 In time-dependent TE, historical information based on periodic 1678 variations in traffic (such as time of day) is used to pre-program 1679 routing plans and other TE control mechanisms. Additionally, 1680 customer subscription or traffic projection may be used. Pre- 1681 programmed routing plans typically change on a relatively long time 1682 scale (e.g., diurnal). Time-dependent algorithms do not attempt to 1683 adapt to random variations in traffic or changing network conditions. 1684 An example of a time-dependent algorithm is a global centralized 1685 optimizer where the input to the system is a traffic matrix and 1686 multi-class QoS requirements as described in [MR99]. Another example of 1687 such a methodology is the application of data mining to Internet 1688 traffic [AJ19]. Data mining enables the use of various machine 1689 learning algorithms to identify patterns within historically 1690 collected datasets about Internet traffic and to extract information 1691 that guides decision-making and improves the efficiency and 1692 productivity of operational processes. 1694 State-dependent TE adapts the routing plans for packets based on the 1695 current state of the network. The current state of the network 1696 provides additional information on variations in actual traffic 1697 (i.e., perturbations from regular variations) that could not be 1698 predicted using historical information. Constraint-based routing is 1699 an example of state-dependent TE operating in a relatively long time 1700 scale. An example operating in a relatively short timescale is a 1701 load-balancing algorithm described in [MATE].
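The following sketch, in Python, illustrates the flavor of a time-dependent mechanism: pre-programmed routing plans are selected purely on the basis of the time of day, with no reaction to the current network state. The node names, plans, and time boundaries are hypothetical and would in practice be derived from historical traffic measurements.

   # Illustrative time-dependent plan selection (hypothetical values).
   from datetime import datetime, timezone

   # Pre-computed explicit paths, keyed by a named traffic period.
   ROUTING_PLANS = {
       "busy-hour": {("r1", "r4"): ["r1", "r2", "r4"],   # spread the load
                     ("r1", "r5"): ["r1", "r3", "r5"]},
       "off-peak":  {("r1", "r4"): ["r1", "r4"],          # shortest paths
                     ("r1", "r5"): ["r1", "r4", "r5"]},
   }

   def current_period(now=None):
       """Map the time of day onto a diurnal traffic period."""
       now = now or datetime.now(timezone.utc)
       return "busy-hour" if 8 <= now.hour < 20 else "off-peak"

   def plan_for(src, dst, now=None):
       """Return the pre-programmed path for a (src, dst) pair."""
       return ROUTING_PLANS[current_period(now)][(src, dst)]

   print(plan_for("r1", "r4"))

A state-dependent mechanism would instead consult measured utilization, delay, or loss before choosing or adjusting the plan, as discussed below.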
Another approach is for 1711 a management system to gather the relevant information directly from 1712 network elements using telemetry data collection "publication/ 1713 subscription" techniques [RFC7923]. 1715 Expeditious and accurate gathering and distribution of state 1716 information is critical for adaptive TE due to the dynamic nature of 1717 network conditions. State-dependent algorithms may be applied to 1718 increase network efficiency and resilience. Time-dependent 1719 algorithms are more suitable for predictable traffic variations. On 1720 the other hand, state-dependent algorithms are more suitable for 1721 adapting to the prevailing network state. 1723 Event-dependent TE methods can also be used for TE path selection. 1724 Event-dependent TE methods are distinct from time-dependent and 1725 state-dependent TE methods in the manner in which paths are selected. 1726 These algorithms are adaptive and distributed in nature and typically 1727 use learning models to find good paths for TE in a network. While 1728 state-dependent TE models typically use available-link-bandwidth 1729 (ALB) flooding for TE path selection, event-dependent TE methods do 1730 not require ALB flooding. Rather, event-dependent TE methods 1731 typically search out capacity by learning models, as in the success- 1732 to-the-top (STT) method. ALB flooding can be resource-intensive, 1733 since it requires link bandwidth to carry LSAs and processor capacity 1734 to process LSAs, and the overhead can limit area/Autonomous System 1735 (AS) size. Modeling results suggest that event-dependent TE methods could 1736 lead to a reduction in ALB flooding overhead without loss of network 1737 throughput performance [I-D.ietf-tewg-qos-routing]. 1739 5.2. Offline Versus Online 1741 Traffic engineering requires the computation of routing plans. The 1742 computation may be performed offline or online. The computation can 1743 be done offline for scenarios where routing plans need not be 1744 executed in real-time. For example, routing plans computed from 1745 forecast information may be computed offline. Typically, offline 1746 computation is also used to perform extensive searches on multi- 1747 dimensional solution spaces. 1749 Online computation is required when the routing plans must adapt to 1750 changing network conditions as in state-dependent algorithms. Unlike 1751 offline computation (which can be computationally demanding), online 1752 computation is geared toward relatively simple and fast calculations to 1753 select routes, fine-tune the allocations of resources, and perform 1754 load balancing. 1756 5.3. Centralized Versus Distributed 1758 Centralized control has a central authority which determines routing 1759 plans and perhaps other TE control parameters on behalf of each 1760 router. The central authority collects the network-state information 1761 from all routers periodically and returns the routing information to 1762 the routers. The routing update cycle is a critical parameter 1763 directly impacting the performance of the network being controlled. 1765 Centralized control may need high processing power and high bandwidth 1766 control channels. 1768 Distributed control determines route selection by each router 1769 autonomously based on the router's view of the state of the network. 1770 The network state information may be obtained by the router using a 1771 probing method or distributed by other routers on a periodic basis 1772 using link state advertisements.
Network state information may also 1773 be disseminated under exceptional conditions. Examples of protocol 1774 extensions used to advertise network link state information are 1775 defined in [RFC5305], [RFC6119], [RFC7471], [RFC7810], and [RFC8571]. 1777 5.3.1. Hybrid Systems 1779 TBD 1781 5.3.2. Considerations for Software Defined Networking 1783 TBD 1785 5.4. Local Versus Global 1787 Traffic engineering algorithms may require local or global network- 1788 state information. 1790 Local information pertains to the state of a portion of the domain. 1791 Examples include the bandwidth and packet loss rate of a particular 1792 path. Local state information may be sufficient for certain 1793 instances of distributed-controlled TEs. 1795 Global information pertains to the state of the entire domain 1796 undergoing traffic engineering. Examples include a global traffic 1797 matrix and loading information on each link throughout the domain of 1798 interest. Global state information is typically required with 1799 centralized control. Distributed TE systems may also need global 1800 information in some cases. 1802 5.5. Prescriptive Versus Descriptive 1804 TE systems may also be classified as prescriptive or descriptive. 1806 Prescriptive traffic engineering evaluates alternatives and 1807 recommends a course of action. Prescriptive traffic engineering can 1808 be further categorized as either corrective or perfective. 1809 Corrective TE prescribes a course of action to address an existing or 1810 predicted anomaly. Perfective TE prescribes a course of action to 1811 evolve and improve network performance even when no anomalies are 1812 evident. 1814 Descriptive traffic engineering, on the other hand, characterizes the 1815 state of the network and assesses the impact of various policies 1816 without recommending any particular course of action. 1818 5.5.1. Intent-Based Networking 1820 TBD 1822 5.6. Open-Loop Versus Closed-Loop 1824 Open-loop traffic engineering control is where control action does 1825 not use feedback information from the current network state. The 1826 control action may use its own local information for accounting 1827 purposes, however. 1829 Closed-loop traffic engineering control is where control action 1830 utilizes feedback information from the network state. The feedback 1831 information may be in the form of historical information or current 1832 measurement. 1834 5.7. Tactical vs Strategic 1836 Tactical traffic engineering aims to address specific performance 1837 problems (such as hot-spots) that occur in the network from a 1838 tactical perspective, without consideration of overall strategic 1839 imperatives. Without proper planning and insights, tactical TE tends 1840 to be ad hoc in nature. 1842 Strategic traffic engineering approaches the TE problem from a more 1843 organized and systematic perspective, taking into consideration the 1844 immediate and longer term consequences of specific policies and 1845 actions. 1847 6. Recommendations for Internet Traffic Engineering 1849 This section describes high-level recommendations for traffic 1850 engineering in the Internet. These recommendations are presented in 1851 general terms. 1853 The recommendations describe the capabilities needed to solve a 1854 traffic engineering problem or to achieve a traffic engineering 1855 objective. Broadly speaking, these recommendations can be 1856 categorized as either functional or non-functional recommendations. 
1858 Functional recommendations for Internet traffic engineering describe 1859 the functions that a traffic engineering system should perform. 1860 These functions are needed to realize traffic engineering objectives 1861 by addressing traffic engineering problems. 1863 Non-functional recommendations for Internet traffic engineering 1864 relate to the quality attributes or state characteristics of a 1865 traffic engineering system. These recommendations may contain 1866 conflicting assertions and may sometimes be difficult to quantify 1867 precisely. 1869 6.1. Generic Non-functional Recommendations 1871 The generic non-functional recommendations for Internet traffic 1872 engineering include: usability, automation, scalability, stability, 1873 visibility, simplicity, efficiency, reliability, correctness, 1874 maintainability, extensibility, interoperability, and security. In a 1875 given context, some of these recommendations may be critical while 1876 others may be optional. Therefore, prioritization may be required 1877 during the development phase of a traffic engineering system (or 1878 components thereof) to tailor it to a specific operational context. 1880 In the following paragraphs, some of the aspects of the non- 1881 functional recommendations for Internet traffic engineering are 1882 summarized. 1884 Usability: Usability is a human-factors aspect of traffic engineering 1885 systems. Usability refers to the ease with which a traffic 1886 engineering system can be deployed and operated. In general, it is 1887 desirable to have a TE system that can be readily deployed in an 1888 existing network. It is also desirable to have a TE system that is 1889 easy to operate and maintain. 1891 Automation: Whenever feasible, a traffic engineering system should 1892 automate as many traffic engineering functions as possible to 1893 minimize the amount of human effort needed to control and analyze 1894 operational networks. Automation is particularly imperative in large-scale 1895 public networks because of the high cost of the human aspects 1896 of network operations and the high risk of network problems caused by 1897 human errors. Automation may entail the incorporation of automatic 1898 feedback and intelligence into some components of the traffic 1899 engineering system. 1901 Scalability: Contemporary public networks are growing very fast with 1902 respect to network size and traffic volume. Therefore, a TE system 1903 should be scalable to remain applicable as the network evolves. In 1904 particular, a TE system should remain functional as the network 1905 expands with regard to the number of routers and links, and with 1906 respect to the traffic volume. A TE system should have a scalable 1907 architecture, should not adversely impair other functions and 1908 processes in a network element, and should not consume too many 1909 network resources when collecting and distributing state information 1910 or when exerting control. 1912 Stability: Stability is a very important consideration in traffic 1913 engineering systems that respond to changes in the state of the 1914 network. State-dependent traffic engineering methodologies typically 1915 mandate a tradeoff between responsiveness and stability. It is 1916 strongly recommended that, when tradeoffs are warranted between 1917 responsiveness and stability, the tradeoff should be made in 1918 favor of stability (especially in public IP backbone networks).
1920 Flexibility: A TE system should be flexible to allow for changes in 1921 optimization policy. In particular, a TE system should provide 1922 sufficient configuration options so that a network administrator can 1923 tailor the TE system to a particular environment. It may also be 1924 desirable to have both online and offline TE subsystems which can be 1925 independently enabled and disabled. TE systems that are used in 1926 multi-class networks should also have options to support class based 1927 performance evaluation and optimization. 1929 Visibility: As part of the TE system, mechanisms should exist to 1930 collect statistics from the network and to analyze these statistics 1931 to determine how well the network is functioning. Derived statistics 1932 such as traffic matrices, link utilization, latency, packet loss, and 1933 other performance measures of interest which are determined from 1934 network measurements can be used as indicators of prevailing network 1935 conditions. Other examples of status information which should be 1936 observed include existing functional routing information 1937 (additionally, in the context of MPLS existing LSP routes), etc. 1939 Simplicity: Generally, a TE system should be as simple as possible. 1940 More importantly, the TE system should be relatively easy to use 1941 (i.e., clean, convenient, and intuitive user interfaces). Simplicity 1942 in user interface does not necessarily imply that the TE system will 1943 use naive algorithms. When complex algorithms and internal 1944 structures are used, such complexities should be hidden as much as 1945 possible from the network administrator through the user interface. 1947 Interoperability: Whenever feasible, traffic engineering systems and 1948 their components should be developed with open standards based 1949 interfaces to allow interoperation with other systems and components. 1951 Security: Security is a critical consideration in traffic engineering 1952 systems. Such traffic engineering systems typically exert control 1953 over certain functional aspects of the network to achieve the desired 1954 performance objectives. Therefore, adequate measures must be taken 1955 to safeguard the integrity of the traffic engineering system. 1956 Adequate measures must also be taken to protect the network from 1957 vulnerabilities that originate from security breaches and other 1958 impairments within the traffic engineering system. 1960 The remainder of this section will focus on some of the high-level 1961 functional recommendations for traffic engineering. 1963 6.2. Routing Recommendations 1965 Routing control is a significant aspect of Internet traffic 1966 engineering. Routing impacts many of the key performance measures 1967 associated with networks, such as throughput, delay, and utilization. 1968 Generally, it is very difficult to provide good service quality in a 1969 wide area network without effective routing control. A desirable 1970 routing system is one that takes traffic characteristics and network 1971 constraints into account during route selection while maintaining 1972 stability. 1974 Traditional shortest path first (SPF) interior gateway protocols are 1975 based on shortest path algorithms and have limited control 1976 capabilities for traffic engineering [RFC2702], [AWD2]. These 1977 limitations include: 1979 1. The well known issues with pure SPF protocols, which do not take 1980 network constraints and traffic characteristics into account 1981 during route selection. 
For example, since IGPs always use the 1982 shortest paths (based on administratively assigned link metrics) 1983 to forward traffic, load sharing cannot be accomplished among 1984 paths of different costs. Using shortest paths to forward 1985 traffic conserves network resources, but may cause the following 1986 problems: 1) If traffic from a source to a destination exceeds 1987 the capacity of a link along the shortest path, the link (hence 1988 the shortest path) becomes congested while a longer path between 1989 these two nodes may be under-utilized; 2) the shortest paths from 1990 different sources can overlap at some links. If the total 1991 traffic from the sources exceeds the capacity of any of these 1992 links, congestion will occur. Problems can also occur because 1993 traffic demand changes over time but network topology and routing 1994 configuration cannot be changed as rapidly. This causes the 1995 network topology and routing configuration to become sub-optimal 1996 over time, which may result in persistent congestion problems. 1998 2. The Equal-Cost Multi-Path (ECMP) capability of SPF IGPs supports 1999 sharing of traffic among equal cost paths between two nodes. 2000 However, ECMP attempts to divide the traffic as equally as 2001 possible among the equal cost shortest paths. Generally, ECMP 2002 does not support configurable load sharing ratios among equal 2003 cost paths. The result is that one of the paths may carry 2004 significantly more traffic than other paths because it may also 2005 carry traffic from other sources. This situation can result in 2006 congestion along the path that carries more traffic. 2008 3. Modifying IGP metrics to control traffic routing tends to have 2009 network-wide effect. Consequently, undesirable and unanticipated 2010 traffic shifts can be triggered as a result. Recent work 2011 described in Section 8 may be capable of better control [FT00], 2012 [FT01]. 2014 Because of these limitations, new capabilities are needed to enhance 2015 the routing function in IP networks. Some of these capabilities have 2016 been described elsewhere and are summarized below. 2018 Constraint-based routing is desirable to evolve the routing 2019 architecture of IP networks, especially public IP backbones with 2020 complex topologies [RFC2702]. Constraint-based routing computes 2021 routes to fulfill requirements subject to constraints. Constraints 2022 may include bandwidth, hop count, delay, and administrative policy 2023 instruments such as resource class attributes [RFC2702], [RFC2386]. 2024 This makes it possible to select routes that satisfy a given set of 2025 requirements subject to network and administrative policy 2026 constraints. Routes computed through constraint-based routing are 2027 not necessarily the shortest paths. Constraint-based routing works 2028 best with path oriented technologies that support explicit routing, 2029 such as MPLS. 2031 Constraint-based routing can also be used as a way to redistribute 2032 traffic onto the infrastructure (even for best effort traffic). For 2033 example, if the bandwidth requirements for path selection and 2034 reservable bandwidth attributes of network links are appropriately 2035 defined and configured, then congestion problems caused by uneven 2036 traffic distribution may be avoided or reduced. In this way, the 2037 performance and efficiency of the network can be improved. 
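As an illustration of the constraint-based routing behavior described above, the following Python sketch prunes links that cannot satisfy a bandwidth requirement (or that belong to an excluded resource class) and then runs a shortest-path computation over the remaining topology. The topology, metrics, and attribute names are hypothetical; a real implementation would operate on the TE database distributed by the IGP extensions discussed next.

   # Illustrative constraint-based path computation (hypothetical topology).
   import heapq

   def cspf(links, src, dst, bw_required, excluded_classes=frozenset()):
       """links: iterable of (a, b, metric, reservable_bw, resource_class)."""
       graph = {}
       for a, b, metric, bw, rclass in links:
           # Prune links that fail the constraints before path selection.
           if bw >= bw_required and rclass not in excluded_classes:
               graph.setdefault(a, []).append((metric, b))

       # Dijkstra's algorithm over the pruned topology.
       queue, visited = [(0, src, [src])], set()
       while queue:
           cost, node, path = heapq.heappop(queue)
           if node == dst:
               return cost, path
           if node in visited:
               continue
           visited.add(node)
           for metric, nxt in graph.get(node, []):
               if nxt not in visited:
                   heapq.heappush(queue, (cost + metric, nxt, path + [nxt]))
       return None  # no path satisfies the constraints

   links = [("A", "B", 10, 100, "gold"), ("B", "D", 10, 40, "gold"),
            ("A", "C", 20, 100, "gold"), ("C", "D", 20, 100, "bronze")]
   print(cspf(links, "A", "D", bw_required=50))  # -> (40, ['A', 'C', 'D'])

Note that the selected path (A-C-D) is not the shortest path in the unconstrained topology (A-B-D), illustrating that routes computed through constraint-based routing are not necessarily the shortest paths.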
2039 A number of enhancements are needed to conventional link state IGPs, 2040 such as OSPF and IS-IS, to allow them to distribute additional state 2041 information required for constraint-based routing. These extensions 2042 to OSPF were described in [RFC3630] and to IS-IS in [RFC5305]. 2043 Essentially, these enhancements require the propagation of additional 2044 information in link state advertisements. Specifically, in addition 2045 to normal link-state information, an enhanced IGP is required to 2046 propagate topology state information needed for constraint-based 2047 routing. Some of the additional topology state information include 2048 link attributes such as reservable bandwidth and link resource class 2049 attribute (an administratively specified property of the link). The 2050 resource class attribute concept was defined in [RFC2702]. The 2051 additional topology state information is carried in new TLVs and sub- 2052 TLVs in IS-IS, or in the Opaque LSA in OSPF [RFC5305], [RFC3630]. 2054 An enhanced link-state IGP may flood information more frequently than 2055 a normal IGP. This is because even without changes in topology, 2056 changes in reservable bandwidth or link affinity can trigger the 2057 enhanced IGP to initiate flooding. A tradeoff is typically required 2058 between the timeliness of the information flooded and the flooding 2059 frequency to avoid excessive consumption of link bandwidth and 2060 computational resources, and more importantly, to avoid instability. 2062 In a TE system, it is also desirable for the routing subsystem to 2063 make the load splitting ratio among multiple paths (with equal cost 2064 or different cost) configurable. This capability gives network 2065 administrators more flexibility in the control of traffic 2066 distribution across the network. It can be very useful for avoiding/ 2067 relieving congestion in certain situations. Examples can be found in 2068 [XIAO]. 2070 The routing system should also have the capability to control the 2071 routes of subsets of traffic without affecting the routes of other 2072 traffic if sufficient resources exist for this purpose. This 2073 capability allows a more refined control over the distribution of 2074 traffic across the network. For example, the ability to move traffic 2075 from a source to a destination away from its original path to another 2076 path (without affecting other traffic paths) allows traffic to be 2077 moved from resource-poor network segments to resource-rich segments. 2078 Path oriented technologies such as MPLS inherently support this 2079 capability as discussed in [AWD2]. 2081 Additionally, the routing subsystem should be able to select 2082 different paths for different classes of traffic (or for different 2083 traffic behavior aggregates) if the network supports multiple classes 2084 of service (different behavior aggregates). 2086 6.3. Traffic Mapping Recommendations 2088 Traffic mapping pertains to the assignment of traffic workload onto 2089 pre-established paths to meet certain requirements. Thus, while 2090 constraint-based routing deals with path selection, traffic mapping 2091 deals with the assignment of traffic to established paths which may 2092 have been selected by constraint-based routing or by some other 2093 means. Traffic mapping can be performed by time-dependent or state- 2094 dependent mechanisms, as described in Section 5.1. 
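The following Python sketch illustrates one simple way of performing the traffic mapping function just described: flows are assigned onto pre-established parallel paths according to configurable ratios, with a hash on the flow identifier so that all packets of a micro-flow stay on the same path. The path names, weights, and flow identifier format are hypothetical assumptions.

   # Illustrative traffic mapping onto parallel paths (hypothetical names).
   import hashlib

   def build_path_table(paths_with_weights, table_size=1000):
       """Expand {path: weight} into a lookup table approximating the ratios."""
       total = sum(paths_with_weights.values())
       table = []
       for path, weight in paths_with_weights.items():
           table.extend([path] * round(table_size * weight / total))
       return table

   def path_for_flow(flow_id, table):
       """The same flow identifier always maps to the same path."""
       digest = hashlib.sha256(flow_id.encode()).digest()
       return table[int.from_bytes(digest[:4], "big") % len(table)]

   # Two pre-established parallel LSPs between the same node pair, loaded 3:1.
   table = build_path_table({"lsp-primary": 3, "lsp-secondary": 1})
   print(path_for_flow("10.0.0.1,10.1.0.9,6,51822,443", table))

Hashing on the flow identifier also helps to preserve packet ordering within a micro-flow, which is relevant to the recommendation below on assigning traffic to multiple parallel paths.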
2096 An important aspect of the traffic mapping function is the ability to 2097 establish multiple paths between an originating node and a 2098 destination node, and the capability to distribute the traffic 2099 between the two nodes across the paths according to some policies. A 2100 pre-condition for this scheme is the existence of flexible mechanisms 2101 to partition traffic and then assign the traffic partitions onto the 2102 parallel paths. This requirement was noted in [RFC2702]. When 2103 traffic is assigned to multiple parallel paths, it is recommended 2104 that special care be taken to ensure proper ordering of 2105 packets belonging to the same application (or micro-flow) at the 2106 destination node of the parallel paths. 2108 As a general rule, mechanisms that perform the traffic mapping 2109 functions should aim to map the traffic onto the network 2110 infrastructure to minimize congestion. If the total traffic load 2111 cannot be accommodated, or if the routing and mapping functions 2112 cannot react fast enough to changing traffic conditions, then a 2113 traffic mapping system may rely on short time scale congestion 2114 control mechanisms (such as queue management, scheduling, etc.) to 2115 mitigate congestion. Thus, mechanisms that perform the traffic 2116 mapping functions should complement existing congestion control 2117 mechanisms. In an operational network, it is generally desirable to 2118 map the traffic onto the infrastructure such that intra-class and 2119 inter-class resource contention are minimized. 2121 When traffic mapping techniques that depend on dynamic state feedback 2122 (e.g., MATE and the like) are used, special care must be taken to 2123 guarantee network stability. 2125 6.4. Measurement Recommendations 2127 The importance of measurement in traffic engineering has been 2128 discussed throughout this document. Mechanisms should be provided to 2129 measure and collect statistics from the network to support the 2130 traffic engineering function. Additional capabilities may be needed 2131 to help in the analysis of the statistics. The actions of these 2132 mechanisms should not adversely affect the accuracy and integrity of 2133 the statistics collected. The mechanisms for statistical data 2134 acquisition should also be able to scale as the network evolves. 2136 Traffic statistics may be classified according to long-term or short- 2137 term timescales. Long-term timescale traffic statistics are very 2138 useful for traffic engineering. Long-term timescale traffic 2139 statistics may capture or reflect periodicity in network workload 2140 (such as hourly, daily, and weekly variations in traffic profiles) as 2141 well as traffic trends. Aspects of the monitored traffic statistics 2142 may also depict class-of-service characteristics for a network 2143 supporting multiple classes of service. Analysis of the long-term 2144 traffic statistics may yield secondary statistics such as busy hour 2145 characteristics, traffic growth patterns, persistent congestion 2146 problems, hot-spots, and imbalances in link utilization caused by 2147 routing anomalies. 2149 A mechanism for constructing traffic matrices for both long-term and 2150 short-term traffic statistics should be in place. In multi-service 2151 IP networks, the traffic matrices may be constructed for different 2152 service classes. Each element of a traffic matrix represents a 2153 statistic of traffic flow between a pair of abstract nodes.
An 2154 abstract node may represent a router, a collection of routers, or a 2155 site in a VPN. 2157 Measured traffic statistics should provide reasonable and reliable 2158 indicators of the current state of the network on the short-term 2159 scale. Some short-term traffic statistics may reflect link 2160 utilization and link congestion status. Examples of congestion 2161 indicators include excessive packet delay, packet loss, and high 2162 resource utilization. Examples of mechanisms for distributing this 2163 kind of information include SNMP, probing techniques, FTP, IGP link 2164 state advertisements, etc. 2166 6.5. Network Survivability 2168 Network survivability refers to the capability of a network to 2169 maintain service continuity in the presence of faults. This can be 2170 accomplished by promptly recovering from network impairments and 2171 maintaining the required QoS for existing services after recovery. 2172 Survivability has become an issue of great concern within the 2173 Internet community due to the increasing demands to carry mission-critical 2174 traffic, real-time traffic, and other high-priority traffic 2175 over the Internet. Survivability can be addressed at the device 2176 level by developing network elements that are more reliable, and at 2177 the network level by incorporating redundancy into the architecture, 2178 design, and operation of networks. It is recommended that a 2179 philosophy of robustness and survivability be adopted in the 2180 architecture, design, and operation of the traffic engineering 2181 systems that control IP networks (especially public IP networks). Because 2182 different contexts may demand different levels of survivability, the 2183 mechanisms developed to support network survivability should be 2184 flexible so that they can be tailored to different needs. A number 2185 of tools and techniques have been developed to enable network 2186 survivability, including MPLS Fast Reroute [RFC4090], RSVP-TE 2187 Extensions in Support of End-to-End Generalized Multi-Protocol Label 2188 Switching (GMPLS) Recovery [RFC4872], and GMPLS Segment Recovery 2189 [RFC4873]. 2191 Failure protection and restoration capabilities have become available 2192 from multiple layers as network technologies have continued to 2193 improve. At the bottom of the layered stack, optical networks are 2194 now capable of providing dynamic ring and mesh restoration 2195 functionality at the wavelength level as well as traditional 2196 protection functionality. At the SONET/SDH layer, survivability 2197 capability is provided with Automatic Protection Switching (APS) as 2198 well as self-healing ring and mesh architectures. Similar 2199 functionality is provided by layer 2 technologies such as ATM 2200 (generally with slower mean restoration times). Rerouting is 2201 traditionally used at the IP layer to restore service following link 2202 and node outages. Rerouting at the IP layer occurs after a period of 2203 routing convergence which may require seconds to minutes to complete. 2204 Some new developments in the MPLS context make it possible to achieve 2205 recovery at the IP layer prior to convergence [RFC3469]. 2207 To support advanced survivability requirements, path-oriented 2208 technologies such as MPLS can be used to enhance the survivability of 2209 IP networks in a potentially cost-effective manner.
The advantages of path-oriented technologies such as MPLS for IP restoration become even more evident when class-based protection and restoration capabilities are required.

Recently, a common suite of control plane protocols has been proposed for both MPLS and optical transport networks under the acronym Multi-protocol Lambda Switching [AWD1]. This new paradigm of Multi-protocol Lambda Switching will support even more sophisticated mesh restoration capabilities at the optical layer for the emerging IP over WDM network architectures.

Another important aspect of multi-layer survivability is that technologies at different layers provide protection and restoration capabilities at different temporal granularities (in terms of time scales) and at different bandwidth granularities (from packet level to wavelength level). Protection and restoration capabilities can also be sensitive to different service classes and different network utility models.

The impact of service outages varies significantly for different service classes depending upon the effective duration of the outage. The duration of an outage can vary from milliseconds (with minor service impact) to seconds (with possible call drops for IP telephony and session time-outs for connection-oriented transactions) to minutes and hours (with potentially considerable social and business impact).

Coordinating different protection and restoration capabilities across multiple layers in a cohesive manner to ensure that network survivability is maintained at reasonable cost is a challenging task. Protection and restoration coordination across layers may not always be feasible, because networks at different layers may belong to different administrative domains.

The following paragraphs present some general recommendations for protection and restoration coordination.

o Protection and restoration capabilities from different layers should be coordinated whenever feasible and appropriate to provide network survivability in a flexible and cost-effective manner. Minimization of function duplication across layers is one way to achieve the coordination. Escalation of alarms and other fault indicators from lower to higher layers may also be performed in a coordinated manner. A temporal order of restoration trigger timing at different layers is another way to coordinate multi-layer protection/restoration.

o Spare capacity at higher layers is often regarded as working traffic at lower layers. Placing protection/restoration functions in many layers may increase redundancy and robustness, but it should not result in significant and avoidable inefficiencies in network resource utilization.

o It is generally desirable to have protection and restoration schemes that are bandwidth efficient.

o Failure notification throughout the network should be timely and reliable.

o Alarms and other fault monitoring and reporting capabilities should be provided at appropriate layers.

6.5.1. Survivability in MPLS Based Networks

MPLS is an important emerging technology that enhances IP networks in terms of features, capabilities, and services.
Because MPLS is path- 2276 oriented, it can potentially provide faster and more predictable 2277 protection and restoration capabilities than conventional hop by hop 2278 routed IP systems. This subsection describes some of the basic 2279 aspects and recommendations for MPLS networks regarding protection 2280 and restoration. See [RFC3469] for a more comprehensive discussion 2281 on MPLS based recovery. 2283 Protection types for MPLS networks can be categorized as link 2284 protection, node protection, path protection, and segment protection. 2286 o Link Protection: The objective for link protection is to protect 2287 an LSP from a given link failure. Under link protection, the path 2288 of the protection or backup LSP (the secondary LSP) is disjoint 2289 from the path of the working or operational LSP at the particular 2290 link over which protection is required. When the protected link 2291 fails, traffic on the working LSP is switched over to the 2292 protection LSP at the head-end of the failed link. This is a 2293 local repair method which can be fast. It might be more 2294 appropriate in situations where some network elements along a 2295 given path are less reliable than others. 2297 o Node Protection: The objective of LSP node protection is to 2298 protect an LSP from a given node failure. Under node protection, 2299 the path of the protection LSP is disjoint from the path of the 2300 working LSP at the particular node to be protected. The secondary 2301 path is also disjoint from the primary path at all links 2302 associated with the node to be protected. When the node fails, 2303 traffic on the working LSP is switched over to the protection LSP 2304 at the upstream LSR directly connected to the failed node. 2306 o Path Protection: The goal of LSP path protection is to protect an 2307 LSP from failure at any point along its routed path. Under path 2308 protection, the path of the protection LSP is completely disjoint 2309 from the path of the working LSP. The advantage of path 2310 protection is that the backup LSP protects the working LSP from 2311 all possible link and node failures along the path, except for 2312 failures that might occur at the ingress and egress LSRs, or for 2313 correlated failures that might impact both working and backup 2314 paths simultaneously. Additionally, since the path selection is 2315 end-to-end, path protection might be more efficient in terms of 2316 resource usage than link or node protection. However, path 2317 protection may be slower than link and node protection in general. 2319 o Segment Protection: An MPLS domain may be partitioned into 2320 multiple protection domains whereby a failure in a protection 2321 domain is rectified within that domain. In cases where an LSP 2322 traverses multiple protection domains, a protection mechanism 2323 within a domain only needs to protect the segment of the LSP that 2324 lies within the domain. Segment protection will generally be 2325 faster than path protection because recovery generally occurs 2326 closer to the fault. 2328 6.5.2. Protection Option 2330 Another issue to consider is the concept of protection options. The 2331 protection option uses the notation m:n protection, where m is the 2332 number of protection LSPs used to protect n working LSPs. Feasible 2333 protection options follow. 2335 o 1:1: one working LSP is protected/restored by one protection LSP. 2337 o 1:n: one protection LSP is used to protect/restore n working LSPs. 
o n:1: one working LSP is protected/restored by n protection LSPs, possibly with a configurable load splitting ratio. When more than one protection LSP is used, it may be desirable to share the traffic across the protection LSPs when the working LSP fails, in order to satisfy the bandwidth requirement of the traffic trunk associated with the working LSP. This may be especially useful when it is not feasible to find one path that can satisfy the bandwidth requirement of the primary LSP.

o 1+1: traffic is sent concurrently on both the working LSP and the protection LSP. In this case, the egress LSR selects one of the two LSPs based on a local traffic integrity decision process, which compares the traffic received from both the working and the protection LSP and identifies discrepancies. It is unlikely that this option would be used extensively in IP networks due to its resource utilization inefficiency. However, if bandwidth becomes plentiful and cheap, then this option might become quite viable and attractive in IP networks.

6.6. Traffic Engineering in Diffserv Environments

This section provides an overview of the traffic engineering features and recommendations that are specifically pertinent to Differentiated Services (Diffserv) [RFC2475] capable IP networks.

Increasing requirements to support multiple classes of traffic, such as best effort and mission-critical data, in the Internet call for IP networks to differentiate traffic according to some criteria, and to accord preferential treatment to certain types of traffic. Large numbers of flows can be aggregated into a few behavior aggregates based on criteria such as common performance requirements (in terms of packet loss ratio, delay, and jitter), or common fields within the IP packet headers.

As Diffserv evolves and becomes deployed in operational networks, traffic engineering will be critical to ensuring that SLAs defined within a given Diffserv service model are met. Classes of service (CoS) can be supported in a Diffserv environment by concatenating per-hop behaviors (PHBs) along the routing path, using service provisioning mechanisms, and by appropriately configuring edge functionality such as traffic classification, marking, policing, and shaping. A PHB is the forwarding behavior that a packet receives at a DS node (a Diffserv-compliant node). It is implemented by means of buffer management and packet scheduling mechanisms. In this context, packets belonging to a class are those that are members of a corresponding ordering aggregate.

Traffic engineering can be used as a complement to Diffserv mechanisms to improve utilization of network resources, but it is not a necessary element in general. When traffic engineering is used, it can be operated on an aggregated basis across all service classes [RFC3270] or on a per service class basis. The former is used to provide better distribution of the aggregate traffic load over the network resources. (See [RFC3270] for detailed mechanisms to support aggregate traffic engineering.) The latter case is discussed below since it is specific to the Diffserv environment, with so-called Diffserv-aware traffic engineering [RFC4124].
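As an informal illustration only (not part of any protocol specification), the following Python sketch shows one way that packets might be grouped into behavior aggregates by DSCP and then steered onto class-specific LSPs of the kind discussed below for Diffserv-aware traffic engineering. The DSCP-to-class mapping and the LSP names are assumptions invented for this example, and the per-flow hash is included only to keep packets of a single micro-flow on one path.

      # Illustrative sketch only: group packets into behavior aggregates
      # by DSCP and map each aggregate onto hypothetical class-specific
      # LSPs.  The DSCP values follow the EF/AF PHB code points; the
      # class and LSP names are invented for this example.

      DSCP_TO_CLASS = {
          46: "voice",        # EF
          26: "business",     # AF31
          10: "best-effort",  # AF11 (assumed mapping for the example)
          0:  "best-effort",  # default PHB
      }

      CLASS_TO_LSPS = {
          "voice":       ["lsp-voice-1"],
          "business":    ["lsp-biz-1", "lsp-biz-2"],
          "best-effort": ["lsp-be-1"],
      }

      def select_lsp(dscp, flow_hash):
          """Pick an LSP for a packet based on its DSCP and a stable
          per-flow hash so that one micro-flow stays on one path."""
          cls = DSCP_TO_CLASS.get(dscp, "best-effort")
          lsps = CLASS_TO_LSPS[cls]
          return lsps[flow_hash % len(lsps)]

In a real deployment the mapping would be driven by operator policy and by the PHBs actually configured at the DS nodes, rather than by a fixed table such as this.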
For some Diffserv networks, it may be desirable to control the performance of some service classes by enforcing certain relationships between the traffic workload contributed by each service class and the amount of network resources allocated or provisioned for that service class. Such relationships between demand and resource allocation can be enforced using a combination of, for example:

o traffic engineering mechanisms on a per service class basis that enforce the desired relationship between the amount of traffic contributed by a given service class and the resources allocated to that class

o mechanisms that dynamically adjust the resources allocated to a given service class to relate to the amount of traffic contributed by that service class.

It may also be desirable to limit the performance impact of high priority traffic on relatively low priority traffic. This can be achieved by, for example, controlling the percentage of high priority traffic that is routed through a given link. Another way to accomplish this is to increase link capacities appropriately so that lower priority traffic can still enjoy adequate service quality. When the ratio of traffic workload contributed by different service classes varies significantly from router to router, it may not suffice to rely exclusively on conventional IGP routing protocols or on traffic engineering mechanisms that are insensitive to different service classes. Instead, it may be desirable to perform traffic engineering, especially routing control and mapping functions, on a per service class basis. One way to accomplish this in a domain that supports both MPLS and Diffserv is to define class-specific LSPs and to map traffic from each class onto one or more LSPs that correspond to that service class. An LSP corresponding to a given service class can then be routed and protected/restored in a class-dependent manner, according to specific policies.

Performing traffic engineering on a per class basis may require certain per-class parameters to be distributed. Note that it is common to have some classes share some aggregate constraint (e.g., maximum bandwidth requirement) without enforcing the constraint on each individual class. These classes can then be grouped into a class-type, and per-class-type parameters can be distributed instead to improve scalability. This also allows better bandwidth sharing between classes in the same class-type. A class-type is a set of classes that satisfy the following two conditions:

o Classes in the same class-type have common aggregate requirements to satisfy required performance levels.

o There is no requirement to be enforced at the level of an individual class in the class-type. Note that it is still possible, nevertheless, to implement some priority policies for classes in the same class-type to permit preferential access to the class-type bandwidth through the use of preemption priorities.

An example of a class-type can be a low-loss class-type that includes both AF1-based and AF2-based Ordering Aggregates. With such a class-type, one may implement a priority policy that assigns higher preemption priority to AF1-based traffic trunks over AF2-based ones, vice versa, or the same priority.
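The class-type concept can be made concrete with a small, hedged sketch. The following Python fragment admits traffic trunks against a single aggregate class-type bandwidth constraint and lets AF1-based trunks preempt AF2-based ones, which is one of the priority policies mentioned above. The bandwidth figure, trunk names, and priority values are assumptions for illustration only and do not correspond to any standardized bandwidth-constraint model.

      # Minimal sketch (not a specification): admit traffic trunks
      # against an aggregate class-type bandwidth constraint.  Classes
      # inside the class-type share the constraint; a lower "priority"
      # number means a higher preemption priority.  Values are invented.

      CLASS_TYPE_BW = 100.0               # aggregate constraint (arbitrary units)
      PREEMPTION = {"AF1": 0, "AF2": 1}   # AF1 preempts AF2 in this example

      admitted = []                       # list of (name, class, bandwidth)

      def used_bw():
          return sum(bw for _, _, bw in admitted)

      def admit(name, cls, bw):
          """Admit a trunk, preempting lower-priority trunks if needed."""
          # Candidate victims: trunks with strictly lower priority,
          # lowest-priority first.
          victims = sorted((t for t in admitted
                            if PREEMPTION[t[1]] > PREEMPTION[cls]),
                           key=lambda t: -PREEMPTION[t[1]])
          while used_bw() + bw > CLASS_TYPE_BW and victims:
              admitted.remove(victims.pop(0))
          if used_bw() + bw <= CLASS_TYPE_BW:
              admitted.append((name, cls, bw))
              return True
          return False

      admit("trunk-a", "AF2", 60.0)       # accepted
      admit("trunk-b", "AF1", 70.0)       # accepted after preempting trunk-a

A deployed implementation would instead follow one of the bandwidth-constraint models standardized for Diffserv-aware MPLS traffic engineering rather than an ad hoc policy such as this.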
See [RFC4124] for detailed requirements on Diffserv-aware traffic engineering.

6.7. Network Controllability

Off-line (and on-line) traffic engineering considerations would be of limited utility if the network could not be controlled effectively to implement the results of TE decisions and to achieve desired network performance objectives. Capacity augmentation is a coarse-grained solution to traffic engineering issues. It is simple and may be advantageous if bandwidth is abundant and cheap or if the current or expected network workload demands it. However, bandwidth is not always abundant and cheap, and the workload may not always demand additional capacity. Adjustments of administrative weights and other parameters associated with routing protocols provide finer-grained control, but they are difficult to use and imprecise because of the routing interactions that occur across the network. In certain network contexts, more flexible, finer-grained approaches which provide more precise control over the mapping of traffic to routes and over the selection and placement of routes may be appropriate and useful.

Control mechanisms can be manual (e.g., administrative configuration), partially automated (e.g., scripts), or fully automated (e.g., policy-based management systems). Automated mechanisms are particularly required in large-scale networks. Multi-vendor interoperability can be facilitated by developing and deploying standardized management systems (e.g., standard MIBs) and policies (PIBs) to support the control functions required to address traffic engineering objectives such as load distribution and protection/restoration.

Network control functions should be secure, reliable, and stable, as they are often needed to operate correctly in times of network impairment (e.g., during network congestion or security attacks).

7. Inter-Domain Considerations

Inter-domain traffic engineering is concerned with performance optimization for traffic that originates in one administrative domain and terminates in a different one.

Traffic exchange between autonomous systems in the Internet occurs through exterior gateway protocols. Currently, BGP [RFC4271] is the standard exterior gateway protocol for the Internet. BGP provides a number of attributes and capabilities (e.g., route filtering) that can be used for inter-domain traffic engineering. More specifically, BGP permits the control of routing information and traffic exchange between Autonomous Systems (ASes) in the Internet. BGP incorporates a sequential decision process which calculates the degree of preference for various routes to a given destination network. There are two fundamental aspects to inter-domain traffic engineering using BGP:

o Route Redistribution: controlling the import and export of routes between ASes, and controlling the redistribution of routes between BGP and other protocols within an AS.

o Best path selection: selecting the best path when there are multiple candidate paths to a given destination network. Best path selection is performed by the BGP decision process based on a sequential procedure, taking a number of different considerations into account.
Ultimately, best path selection under BGP boils down to selecting preferred exit points out of an AS towards specific destination networks. The BGP path selection process can be influenced by manipulating the attributes associated with the BGP decision process. These attributes include: NEXT-HOP, WEIGHT (a Cisco proprietary attribute that is also implemented by some other vendors), LOCAL-PREFERENCE, AS-PATH, ROUTE-ORIGIN, MULTI-EXIT-DISCRIMINATOR (MED), IGP METRIC, etc.

Route-maps provide the flexibility to implement complex BGP policies based on pre-configured logical conditions. In particular, Route-maps can be used to control import and export policies for incoming and outgoing routes, to control the redistribution of routes between BGP and other protocols, and to influence the selection of best paths by manipulating the attributes associated with the BGP decision process. Very complex logical expressions that implement various types of policies can be implemented using a combination of Route-maps, BGP attributes, Access-lists, and Community attributes.

When looking at possible strategies for inter-domain TE with BGP, it must be noted that the outbound traffic exit point is controllable, whereas the interconnection point where inbound traffic is received from an EBGP peer typically is not, unless a special arrangement is made with the peer sending the traffic. Therefore, it is up to each individual network to implement sound TE strategies that deal with the efficient delivery of outbound traffic from one's customers to one's peering points. The vast majority of TE policy is based upon a "closest exit" strategy, which offloads inter-domain traffic at the nearest outbound peer point towards the destination autonomous system. Most methods of manipulating the point at which inbound traffic enters a network from an EBGP peer (inconsistent route announcements between peering points, AS-path prepending, and sending MEDs) are either ineffective or not accepted in the peering community.

Inter-domain TE with BGP is generally effective, but it is usually applied in a trial-and-error fashion. A systematic approach for inter-domain traffic engineering is yet to be devised.

Inter-domain TE is inherently more difficult than intra-domain TE under the current Internet architecture. The reasons for this are both technical and administrative. Technically, while topology and link state information are helpful for mapping traffic more effectively, BGP does not propagate such information across domain boundaries for stability and scalability reasons. Administratively, there are differences in operating costs and network capacities between domains. Generally, what may be considered a good solution in one domain may not necessarily be a good solution in another domain. Moreover, it would generally be considered inadvisable for one domain to permit another domain to influence the routing and management of traffic in its network.

MPLS TE-tunnels (explicit LSPs) can potentially add a degree of flexibility in the selection of exit points for inter-domain routing. The concept of relative and absolute metrics can be applied to this purpose.
The idea is that if BGP attributes are defined such that the BGP decision process depends on IGP metrics to select exit points for inter-domain traffic, then some inter-domain traffic destined to a given peer network can be made to prefer a specific exit point by establishing a TE-tunnel from the router making the selection to the peering point and assigning the TE-tunnel a metric that is smaller than the IGP cost to all other peering points. If a peer accepts and processes MEDs, then a similar MPLS TE-tunnel based scheme can be applied to cause certain entrance points to be preferred by setting the MED to an IGP cost that has been modified by the tunnel metric.

Similar to intra-domain TE, inter-domain TE is best accomplished when a traffic matrix can be derived to depict the volume of traffic from one autonomous system to another.

Generally, redistribution of inter-domain traffic requires coordination between peering partners. An export policy in one domain that results in load redistribution across peer points with another domain can significantly affect the local traffic matrix inside the domain of the peering partner. This, in turn, will affect the intra-domain TE due to changes in the spatial distribution of traffic. Therefore, it is mutually beneficial for peering partners to coordinate with each other before attempting any policy changes that may result in significant shifts in inter-domain traffic. In certain contexts, this coordination can be quite challenging due to technical and non-technical reasons.

It is a matter of speculation as to whether MPLS, or similar technologies, can be extended to allow selection of constrained paths across domain boundaries.

8. Overview of Contemporary TE Practices in Operational IP Networks

This section provides an overview of some contemporary traffic engineering practices in IP networks. The focus is primarily on the aspects that pertain to the control of the routing function in operational contexts. The intent here is to provide an overview of the commonly used practices; the discussion is not intended to be exhaustive.

Currently, service providers apply many of the traffic engineering mechanisms discussed in this document to optimize the performance of their IP networks. These techniques include capacity planning for long timescales, routing control using IGP metrics and MPLS for medium timescales, the overlay model also for medium timescales, and traffic management mechanisms for short timescales.

When a service provider plans to build an IP network, or expand the capacity of an existing network, effective capacity planning should be an important component of the process. Such plans may take the following aspects into account: location of new nodes (if any), existing and predicted traffic patterns, costs, link capacity, topology, routing design, and survivability.

Performance optimization of operational networks is usually an ongoing process in which traffic statistics, performance parameters, and fault indicators are continually collected from the network. This empirical data is then analyzed and used to trigger various traffic engineering mechanisms.
Tools that perform what-if analysis can also be used to assist the TE process by allowing various scenarios to be reviewed before a new set of configurations is implemented in the operational network.

Traditionally, intra-domain real-time TE with an IGP is done by increasing the OSPF or IS-IS metric of a congested link until enough traffic has been diverted from that link. This approach has some limitations as discussed in Section 6.2. Recently, some new intra-domain TE approaches/tools have been proposed [RR94] [FT00] [FT01] [WANG]. Such approaches/tools take a traffic matrix, the network topology, and network performance objectives as input, and produce link metrics and possibly some unequal load-sharing ratios to be set at the head-end routers of some ECMPs as output. These developments open up the possibility of performing intra-domain TE with an IGP in a more systematic way.

The overlay model (IP over ATM, or IP over Frame Relay) is another approach which was commonly used [AWD2], but it has been replaced by MPLS and router hardware technology.

Deployment of MPLS for traffic engineering applications has commenced in some service provider networks. One operational scenario is to deploy MPLS in conjunction with an IGP (IS-IS-TE or OSPF-TE) that supports the traffic engineering extensions, in conjunction with constraint-based routing for explicit route computations, and a signaling protocol (e.g., RSVP-TE) for LSP instantiation.

In contemporary MPLS traffic engineering contexts, network administrators specify and configure link attributes and resource constraints such as maximum reservable bandwidth and resource class attributes for links (interfaces) within the MPLS domain. A link state protocol that supports TE extensions (IS-IS-TE or OSPF-TE) is used to propagate information about network topology and link attributes to all routers in the routing area. Network administrators also specify all the LSPs that are to originate at each router. For each LSP, the network administrator specifies the destination node and the attributes of the LSP which indicate the requirements that are to be satisfied during the path selection process. Each router then uses a local constraint-based routing process to compute explicit paths for all LSPs originating from it. Subsequently, a signaling protocol is used to instantiate the LSPs. By assigning proper bandwidth values to links and LSPs, congestion caused by uneven traffic distribution can generally be avoided or mitigated.

The bandwidth attributes of LSPs used for traffic engineering can be updated periodically. The basic concept is that the bandwidth assigned to an LSP should relate in some manner to the bandwidth requirements of traffic that actually flows through the LSP. The traffic attribute of an LSP can be modified to accommodate traffic growth and persistent traffic shifts. If network congestion occurs due to unexpected events, existing LSPs can be rerouted to alleviate the situation, or network administrators can configure new LSPs to divert some traffic to alternative paths. The reservable bandwidth of the congested links can also be reduced to force some LSPs to be rerouted to other paths.

In an MPLS domain, a traffic matrix can also be estimated by monitoring the traffic on LSPs.
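As a rough, illustrative sketch (assuming that per-LSP byte counters can be polled from the ingress LSRs, for example via SNMP or streaming telemetry), the following Python fragment shows how such measurements might be aggregated into a coarse traffic matrix keyed by ingress/egress node pairs. The node names, counter values, and polling interval are invented for the example.

      # Illustrative sketch: derive a coarse traffic matrix from per-LSP
      # byte counters polled at two points in time.  The LSP list and
      # counter values are invented; a real system would obtain them
      # from the management plane.

      from collections import defaultdict

      # (ingress, egress, bytes_at_t0, bytes_at_t1) for each LSP
      lsp_samples = [
          ("PE1", "PE3", 1.0e9, 1.6e9),
          ("PE1", "PE3", 2.0e9, 2.3e9),   # parallel LSP, same node pair
          ("PE2", "PE3", 5.0e8, 9.0e8),
      ]
      interval = 300.0                    # seconds between the two polls

      matrix = defaultdict(float)         # (ingress, egress) -> bits/s
      for ingress, egress, t0, t1 in lsp_samples:
          matrix[(ingress, egress)] += (t1 - t0) * 8 / interval

      for (src, dst), bps in sorted(matrix.items()):
          print(f"{src} -> {dst}: {bps/1e6:.1f} Mbit/s")

Parallel LSPs between the same pair of nodes are simply summed, which is consistent with the abstract-node view of the traffic matrix described in Section 6.4.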
Such traffic statistics can be used for a variety of purposes including network planning and network optimization. Current practice suggests that deploying an MPLS network consisting of hundreds of routers and thousands of LSPs is feasible. In summary, recent deployment experience suggests that the MPLS approach is very effective for traffic engineering in IP networks [XIAO].

As mentioned previously in Section 7, one usually has no direct control over the distribution of inbound traffic. Therefore, the main goal of contemporary inter-domain TE is to optimize the distribution of outbound traffic between multiple inter-domain links. When operating a global network, maintaining the ability to operate the network in a regional fashion where desired, while continuing to take advantage of the benefits of a global network, also becomes an important objective.

Inter-domain TE with BGP usually begins with the placement of multiple peering interconnection points in locations that have high peer density, are in close proximity to originating/terminating traffic locations on one's own network, and are lowest in cost. There are generally several locations in each region of the world where the vast majority of major networks congregate and interconnect. Some location-decision problems that arise in association with inter-domain routing are discussed in [AWD5].

Once the locations of the interconnects are determined, and circuits are implemented, one decides how best to handle the routes heard from the peer, as well as how to propagate the peers' routes within one's own network. One way to engineer outbound traffic flows on a network with many EBGP peers is to create a hierarchy of peers. Generally, the Local Preferences of all peers are set to the same value so that the shortest AS paths will be chosen to forward traffic. Then, by overwriting the inbound MED (the Multi-Exit Discriminator metric, also referred to as the "BGP metric"; both terms are used interchangeably in this document) with BGP metrics assigned to routes received at different peers, the hierarchy can be formed. For example, all Local Preferences can be set to 200, preferred private peers can be assigned a BGP metric of 50, the rest of the private peers can be assigned a BGP metric of 100, and public peers can be assigned a BGP metric of 600. "Preferred" peers might be defined as those peers with whom the most available capacity exists, whose customer base is larger in comparison to other peers, whose interconnection costs are the lowest, and with whom upgrading existing capacity is the easiest. In a network with low utilization at the edge, this works well. The same concept could be applied to a network with higher edge utilization by creating more levels of BGP metrics between peers, allowing for more granularity in selecting the exit points for traffic bound for a dual-homed customer on a peer's network.

By only replacing inbound MEDs with BGP metrics, only the exit points of routes with equal AS-Path length are changed. (The BGP decision process considers Local Preference first, then AS-Path length, and then the BGP metric.) For example, assume a network has two possible egress points, peer A and peer B.
Each peer has 40% of the Internet's routes exclusively on its network, while the remaining 20% of the Internet's routes are from customers who dual-home between A and B. Assume that both peers have a Local Preference of 200 and a BGP metric of 100. If the link to peer A is congested, increasing its BGP metric while leaving the Local Preference at 200 will ensure that the 20% of total routes belonging to dual-homed customers will prefer peer B as the exit point. This approach would be used in a situation where all exit points to a given peer were close to congestion levels, and traffic needed to be shifted away from that peer entirely.

When there are multiple exit points to a given peer, and only one of them is congested, it is not necessary to shift traffic away from the peer entirely, but only from the one congested circuit. This can be achieved by using passive IGP metrics, AS-path filtering, or prefix filtering.

Occasionally, more drastic changes are needed, for example, in dealing with a "problem peer" who is difficult to work with on upgrades or is charging high prices for connectivity to their network. In that case, the Local Preference to that peer can be reduced below the level of other peers. This effectively reduces the amount of traffic sent to that peer to only originating traffic (assuming no transit providers are involved). This type of change can affect a large amount of traffic, and is only used after other methods have failed to provide the desired results.

Although it is not much of an issue in regional networks, the propagation of a peer's routes back through the network must be considered when a network is peering on a global scale. Sometimes, business considerations can influence the choice of BGP policies in a given context. For example, it may be imprudent, from a business perspective, to operate a global network and provide full access to the global customer base to a small network in a particular country. However, for the purpose of providing one's own customers with quality service in a particular region, good connectivity to that in-country network may still be necessary. This can be achieved by assigning a set of communities at the edge of the network, which have a known behavior when routes tagged with those communities are propagated back through the core. Routes heard from local peers will be prevented from propagating back to the global network, whereas routes learned from larger peers may be allowed to propagate freely throughout the entire global network. By implementing a flexible community strategy, the benefits of using a single global AS Number (ASN) can be realized, while the benefits of operating regional networks can also be taken advantage of. An alternative to doing this is to use different ASNs in different regions, with the consequence that the AS path length for routes announced by that service provider will increase.

9. Conclusion

This document described principles for traffic engineering in the Internet. It presented an overview of some of the basic issues surrounding traffic engineering in IP networks. The context of TE was described, and a TE process model and a taxonomy of TE styles were presented.
A brief historical review of pertinent developments 2805 related to traffic engineering was provided. A survey of 2806 contemporary TE techniques in operational networks was presented. 2807 Additionally, the document specified a set of generic requirements, 2808 recommendations, and options for Internet traffic engineering. 2810 10. Security Considerations 2812 This document does not introduce new security issues. 2814 11. IANA Considerations 2816 This draft makes no requests for IANA action. 2818 12. Acknowledgments 2820 Much of the text in this document is derived from RFC 3272. The 2821 authors of this document would like to express their gratitude to all 2822 involved in that work. Although the source text has been edited in 2823 the production of this document, the orginal authors should be 2824 considered as Contributors to this work. They were: 2826 Daniel O. Awduche 2827 Movaz Networks 2828 7926 Jones Branch Drive, Suite 615 2829 McLean, VA 22102 2831 Phone: 703-298-5291 2832 EMail: awduche@movaz.com 2834 Angela Chiu 2835 Celion Networks 2836 1 Sheila Dr., Suite 2 2837 Tinton Falls, NJ 07724 2839 Phone: 732-747-9987 2840 EMail: angela.chiu@celion.com 2842 Anwar Elwalid 2843 Lucent Technologies 2844 Murray Hill, NJ 07974 2846 Phone: 908 582-7589 2847 EMail: anwar@lucent.com 2849 Indra Widjaja 2850 Bell Labs, Lucent Technologies 2851 600 Mountain Avenue 2852 Murray Hill, NJ 07974 2854 Phone: 908 582-0435 2855 EMail: iwidjaja@research.bell-labs.com 2857 XiPeng Xiao 2858 Redback Networks 2859 300 Holger Way 2860 San Jose, CA 95134 2862 Phone: 408-750-5217 2863 EMail: xipeng@redback.com 2865 The acknowledgements in RFC3272 were as below. All people who helped 2866 in the production of that document also need to be thanked for the 2867 carry-over into this new document. 2869 The authors would like to thank Jim Boyle for inputs on the 2870 recommendations section, Francois Le Faucheur for inputs on 2871 Diffserv aspects, Blaine Christian for inputs on measurement, 2872 Gerald Ash for inputs on routing in telephone networks and for 2873 text on event-dependent TE methods, Steven Wright for inputs 2874 on network controllability, and Jonathan Aufderheide for 2875 inputs on inter-domain TE with BGP. Special thanks to 2876 Randy Bush for proposing the TE taxonomy based on "tactical vs 2877 strategic" methods. The subsection describing an "Overview of 2878 ITU Activities Related to Traffic Engineering" was adapted from 2879 a contribution by Waisum Lai. Useful feedback and pointers to 2880 relevant materials were provided by J. Noel Chiappa. 2881 Additional comments were provided by Glenn Grotefeld during 2882 the working last call process. Finally, the authors would like 2883 to thank Ed Kern, the TEWG co-chair, for his comments and 2884 support. 2886 The early versions of this document were produced by the TEAS Working 2887 Group's RFC3272bis Design Team. The full list of members of this 2888 team is: 2890 Acee Lindem 2891 Adrian Farrel 2892 Aijun Wang 2893 Daniele Ceccarelli 2894 Dieter Beller 2895 Jeff Tantsura 2896 Julien Meuric 2897 Liu Hua 2898 Loa Andersson 2899 Luis Miguel Contreras 2900 Martin Horneffer 2901 Tarek Saad 2902 Xufeng Liu 2904 The production of this document includes a fix to the original text 2905 resulting from an Errata Report by Jean-Michel Grimaldi. 2907 The authors of this document would also like to thank Dhurv Dhody for 2908 review comments. 2910 13. 
Contributors 2912 The following people contributed substantive text to this document: 2914 Gert Grammel 2915 EMail: ggrammel@juniper.net 2917 Loa Andersson 2918 EMail: loa@pi.nu 2920 Xufeng Liu 2921 EMail: xufeng.liu.ietf@gmail.com 2923 Lou Berger 2924 EMail: lberger@labn.net 2926 Jeff Tantsura 2927 EMail: jefftant.ietf@gmail.com 2929 14. Informative References 2931 [AJ19] Adekitan, A., Abolade, J., and O. Shobayo, "Data mining 2932 approach for predicting the daily Internet data traffic of 2933 a smart university", Article Journal of Big Data, 2019, 2934 Volume 6, Number 1, Page 1, 1998. 2936 [ASH2] Ash, J., "Dynamic Routing in Telecommunications Networks", 2937 Book McGraw Hill, 1998. 2939 [AWD1] Awduche, D. and Y. Rekhter, "Multiprocotol Lambda 2940 Switching - Combining MPLS Traffic Engineering Control 2941 with Optical Crossconnects", Article IEEE Communications 2942 Magazine, March 2001. 2944 [AWD2] Awduche, D., "MPLS and Traffic Engineering in IP 2945 Networks", Article IEEE Communications Magazine, December 2946 1999. 2948 [AWD5] Awduche, D., "An Approach to Optimal Peering Between 2949 Autonomous Systems in the Internet", Paper International 2950 Conference on Computer Communications and Networks 2951 (ICCCN'98), October 1998. 2953 [FLJA93] Floyd, S. and V. Jacobson, "Random Early Detection 2954 Gateways for Congestion Avoidance", Article IEEE/ACM 2955 Transactions on Networking, Vol. 1, p. 387-413, November 2956 1993. 2958 [FLOY94] Floyd, S., "TCP and Explicit Congestion Notification", 2959 Article ACM Computer Communication Review, V. 24, No. 5, 2960 p. 10-23, October 1994. 2962 [FT00] Fortz, B. and M. Thorup, "Internet Traffic Engineering by 2963 Optimizing OSPF Weights", Article IEEE INFOCOM 2000, March 2964 2000. 2966 [FT01] Fortz, B. and M. Thorup, "Optimizing OSPF/IS-IS Weights in 2967 a Changing World", n.d., 2968 . 2970 [HUSS87] Hurley, B., Seidl, C., and W. Sewel, "A Survey of Dynamic 2971 Routing Methods for Circuit-Switched Traffic", 2972 Article IEEE Communication Magazine, September 1987. 2974 [I-D.ietf-teas-yang-te-topo] 2975 Liu, X., Bryskin, I., Beeram, V., Saad, T., Shah, H., and 2976 O. Dios, "YANG Data Model for Traffic Engineering (TE) 2977 Topologies", draft-ietf-teas-yang-te-topo-22 (work in 2978 progress), June 2019. 2980 [I-D.ietf-tewg-qos-routing] 2981 Ash, G., "Traffic Engineering & QoS Methods for IP-, ATM-, 2982 & Based Multiservice Networks", draft-ietf-tewg-qos- 2983 routing-04 (work in progress), October 2001. 2985 [ITU-E600] 2986 "Terms and Definitions of Traffic Engineering", 2987 Recommendation ITU-T Recommendation E.600, March 1993. 2989 [ITU-E701] 2990 "Reference Connections for Traffic Engineering", 2991 Recommendation ITU-T Recommendation E.701, October 1993. 2993 [ITU-E801] 2994 "Framework for Service Quality Agreement", 2995 Recommendation ITU-T Recommendation E.801, October 1996. 2997 [MA] Ma, Q., "Quality of Service Routing in Integrated Services 2998 Networks", Ph.D. PhD Dissertation, CMU-CS-98-138, CMU, 2999 1998. 3001 [MATE] Elwalid, A., Jin, C., Low, S., and I. Widjaja, "MATE - 3002 MPLS Adaptive Traffic Engineering", 3003 Proceedings INFOCOM'01, April 2001. 3005 [MCQ80] McQuillan, J., Richer, I., and E. Rosen, "The New Routing 3006 Algorithm for the ARPANET", Transaction IEEE Transactions 3007 on Communications, vol. 28, no. 5, p. 711-719, May 1980. 3009 [MR99] Mitra, D. and K. Ramakrishnan, "A Case Study of 3010 Multiservice, Multipriority Traffic Engineering Design for 3011 Data Networks", Proceedings Globecom'99, December 1999. 
3013 [RFC0791] Postel, J., "Internet Protocol", STD 5, RFC 791, 3014 DOI 10.17487/RFC0791, September 1981, 3015 . 3017 [RFC1102] Clark, D., "Policy routing in Internet protocols", 3018 RFC 1102, DOI 10.17487/RFC1102, May 1989, 3019 . 3021 [RFC1104] Braun, H., "Models of policy based routing", RFC 1104, 3022 DOI 10.17487/RFC1104, June 1989, 3023 . 3025 [RFC1992] Castineyra, I., Chiappa, N., and M. Steenstrup, "The 3026 Nimrod Routing Architecture", RFC 1992, 3027 DOI 10.17487/RFC1992, August 1996, 3028 . 3030 [RFC2205] Braden, R., Ed., Zhang, L., Berson, S., Herzog, S., and S. 3031 Jamin, "Resource ReSerVation Protocol (RSVP) -- Version 1 3032 Functional Specification", RFC 2205, DOI 10.17487/RFC2205, 3033 September 1997, . 3035 [RFC2328] Moy, J., "OSPF Version 2", STD 54, RFC 2328, 3036 DOI 10.17487/RFC2328, April 1998, 3037 . 3039 [RFC2330] Paxson, V., Almes, G., Mahdavi, J., and M. Mathis, 3040 "Framework for IP Performance Metrics", RFC 2330, 3041 DOI 10.17487/RFC2330, May 1998, 3042 . 3044 [RFC2386] Crawley, E., Nair, R., Rajagopalan, B., and H. Sandick, "A 3045 Framework for QoS-based Routing in the Internet", 3046 RFC 2386, DOI 10.17487/RFC2386, August 1998, 3047 . 3049 [RFC2474] Nichols, K., Blake, S., Baker, F., and D. Black, 3050 "Definition of the Differentiated Services Field (DS 3051 Field) in the IPv4 and IPv6 Headers", RFC 2474, 3052 DOI 10.17487/RFC2474, December 1998, 3053 . 3055 [RFC2475] Blake, S., Black, D., Carlson, M., Davies, E., Wang, Z., 3056 and W. Weiss, "An Architecture for Differentiated 3057 Services", RFC 2475, DOI 10.17487/RFC2475, December 1998, 3058 . 3060 [RFC2597] Heinanen, J., Baker, F., Weiss, W., and J. Wroclawski, 3061 "Assured Forwarding PHB Group", RFC 2597, 3062 DOI 10.17487/RFC2597, June 1999, 3063 . 3065 [RFC2678] Mahdavi, J. and V. Paxson, "IPPM Metrics for Measuring 3066 Connectivity", RFC 2678, DOI 10.17487/RFC2678, September 3067 1999, . 3069 [RFC2702] Awduche, D., Malcolm, J., Agogbua, J., O'Dell, M., and J. 3070 McManus, "Requirements for Traffic Engineering Over MPLS", 3071 RFC 2702, DOI 10.17487/RFC2702, September 1999, 3072 . 3074 [RFC2722] Brownlee, N., Mills, C., and G. Ruth, "Traffic Flow 3075 Measurement: Architecture", RFC 2722, 3076 DOI 10.17487/RFC2722, October 1999, 3077 . 3079 [RFC2753] Yavatkar, R., Pendarakis, D., and R. Guerin, "A Framework 3080 for Policy-based Admission Control", RFC 2753, 3081 DOI 10.17487/RFC2753, January 2000, 3082 . 3084 [RFC2961] Berger, L., Gan, D., Swallow, G., Pan, P., Tommasi, F., 3085 and S. Molendini, "RSVP Refresh Overhead Reduction 3086 Extensions", RFC 2961, DOI 10.17487/RFC2961, April 2001, 3087 . 3089 [RFC2998] Bernet, Y., Ford, P., Yavatkar, R., Baker, F., Zhang, L., 3090 Speer, M., Braden, R., Davie, B., Wroclawski, J., and E. 3091 Felstaine, "A Framework for Integrated Services Operation 3092 over Diffserv Networks", RFC 2998, DOI 10.17487/RFC2998, 3093 November 2000, . 3095 [RFC3031] Rosen, E., Viswanathan, A., and R. Callon, "Multiprotocol 3096 Label Switching Architecture", RFC 3031, 3097 DOI 10.17487/RFC3031, January 2001, 3098 . 3100 [RFC3086] Nichols, K. and B. Carpenter, "Definition of 3101 Differentiated Services Per Domain Behaviors and Rules for 3102 their Specification", RFC 3086, DOI 10.17487/RFC3086, 3103 April 2001, . 3105 [RFC3124] Balakrishnan, H. and S. Seshan, "The Congestion Manager", 3106 RFC 3124, DOI 10.17487/RFC3124, June 2001, 3107 . 3109 [RFC3209] Awduche, D., Berger, L., Gan, D., Li, T., Srinivasan, V., 3110 and G. 
Swallow, "RSVP-TE: Extensions to RSVP for LSP 3111 Tunnels", RFC 3209, DOI 10.17487/RFC3209, December 2001, 3112 . 3114 [RFC3270] Le Faucheur, F., Wu, L., Davie, B., Davari, S., Vaananen, 3115 P., Krishnan, R., Cheval, P., and J. Heinanen, "Multi- 3116 Protocol Label Switching (MPLS) Support of Differentiated 3117 Services", RFC 3270, DOI 10.17487/RFC3270, May 2002, 3118 . 3120 [RFC3272] Awduche, D., Chiu, A., Elwalid, A., Widjaja, I., and X. 3121 Xiao, "Overview and Principles of Internet Traffic 3122 Engineering", RFC 3272, DOI 10.17487/RFC3272, May 2002, 3123 . 3125 [RFC3469] Sharma, V., Ed. and F. Hellstrand, Ed., "Framework for 3126 Multi-Protocol Label Switching (MPLS)-based Recovery", 3127 RFC 3469, DOI 10.17487/RFC3469, February 2003, 3128 . 3130 [RFC3630] Katz, D., Kompella, K., and D. Yeung, "Traffic Engineering 3131 (TE) Extensions to OSPF Version 2", RFC 3630, 3132 DOI 10.17487/RFC3630, September 2003, 3133 . 3135 [RFC3945] Mannie, E., Ed., "Generalized Multi-Protocol Label 3136 Switching (GMPLS) Architecture", RFC 3945, 3137 DOI 10.17487/RFC3945, October 2004, 3138 . 3140 [RFC4090] Pan, P., Ed., Swallow, G., Ed., and A. Atlas, Ed., "Fast 3141 Reroute Extensions to RSVP-TE for LSP Tunnels", RFC 4090, 3142 DOI 10.17487/RFC4090, May 2005, 3143 . 3145 [RFC4124] Le Faucheur, F., Ed., "Protocol Extensions for Support of 3146 Diffserv-aware MPLS Traffic Engineering", RFC 4124, 3147 DOI 10.17487/RFC4124, June 2005, 3148 . 3150 [RFC4203] Kompella, K., Ed. and Y. Rekhter, Ed., "OSPF Extensions in 3151 Support of Generalized Multi-Protocol Label Switching 3152 (GMPLS)", RFC 4203, DOI 10.17487/RFC4203, October 2005, 3153 . 3155 [RFC4271] Rekhter, Y., Ed., Li, T., Ed., and S. Hares, Ed., "A 3156 Border Gateway Protocol 4 (BGP-4)", RFC 4271, 3157 DOI 10.17487/RFC4271, January 2006, 3158 . 3160 [RFC4594] Babiarz, J., Chan, K., and F. Baker, "Configuration 3161 Guidelines for DiffServ Service Classes", RFC 4594, 3162 DOI 10.17487/RFC4594, August 2006, 3163 . 3165 [RFC4655] Farrel, A., Vasseur, J., and J. Ash, "A Path Computation 3166 Element (PCE)-Based Architecture", RFC 4655, 3167 DOI 10.17487/RFC4655, August 2006, 3168 . 3170 [RFC4872] Lang, J., Ed., Rekhter, Y., Ed., and D. Papadimitriou, 3171 Ed., "RSVP-TE Extensions in Support of End-to-End 3172 Generalized Multi-Protocol Label Switching (GMPLS) 3173 Recovery", RFC 4872, DOI 10.17487/RFC4872, May 2007, 3174 . 3176 [RFC4873] Berger, L., Bryskin, I., Papadimitriou, D., and A. Farrel, 3177 "GMPLS Segment Recovery", RFC 4873, DOI 10.17487/RFC4873, 3178 May 2007, . 3180 [RFC5305] Li, T. and H. Smit, "IS-IS Extensions for Traffic 3181 Engineering", RFC 5305, DOI 10.17487/RFC5305, October 3182 2008, . 3184 [RFC5331] Aggarwal, R., Rekhter, Y., and E. Rosen, "MPLS Upstream 3185 Label Assignment and Context-Specific Label Space", 3186 RFC 5331, DOI 10.17487/RFC5331, August 2008, 3187 . 3189 [RFC5394] Bryskin, I., Papadimitriou, D., Berger, L., and J. Ash, 3190 "Policy-Enabled Path Computation Framework", RFC 5394, 3191 DOI 10.17487/RFC5394, December 2008, 3192 . 3194 [RFC5440] Vasseur, JP., Ed. and JL. Le Roux, Ed., "Path Computation 3195 Element (PCE) Communication Protocol (PCEP)", RFC 5440, 3196 DOI 10.17487/RFC5440, March 2009, 3197 . 3199 [RFC5575] Marques, P., Sheth, N., Raszuk, R., Greene, B., Mauch, J., 3200 and D. McPherson, "Dissemination of Flow Specification 3201 Rules", RFC 5575, DOI 10.17487/RFC5575, August 2009, 3202 . 3204 [RFC6119] Harrison, J., Berger, J., and M. 
Bartlett, "IPv6 Traffic 3205 Engineering in IS-IS", RFC 6119, DOI 10.17487/RFC6119, 3206 February 2011, . 3208 [RFC6241] Enns, R., Ed., Bjorklund, M., Ed., Schoenwaelder, J., Ed., 3209 and A. Bierman, Ed., "Network Configuration Protocol 3210 (NETCONF)", RFC 6241, DOI 10.17487/RFC6241, June 2011, 3211 . 3213 [RFC6374] Frost, D. and S. Bryant, "Packet Loss and Delay 3214 Measurement for MPLS Networks", RFC 6374, 3215 DOI 10.17487/RFC6374, September 2011, 3216 . 3218 [RFC6805] King, D., Ed. and A. Farrel, Ed., "The Application of the 3219 Path Computation Element Architecture to the Determination 3220 of a Sequence of Domains in MPLS and GMPLS", RFC 6805, 3221 DOI 10.17487/RFC6805, November 2012, 3222 . 3224 [RFC7149] Boucadair, M. and C. Jacquenet, "Software-Defined 3225 Networking: A Perspective from within a Service Provider 3226 Environment", RFC 7149, DOI 10.17487/RFC7149, March 2014, 3227 . 3229 [RFC7390] Rahman, A., Ed. and E. Dijk, Ed., "Group Communication for 3230 the Constrained Application Protocol (CoAP)", RFC 7390, 3231 DOI 10.17487/RFC7390, October 2014, 3232 . 3234 [RFC7471] Giacalone, S., Ward, D., Drake, J., Atlas, A., and S. 3235 Previdi, "OSPF Traffic Engineering (TE) Metric 3236 Extensions", RFC 7471, DOI 10.17487/RFC7471, March 2015, 3237 . 3239 [RFC7679] Almes, G., Kalidindi, S., Zekauskas, M., and A. Morton, 3240 Ed., "A One-Way Delay Metric for IP Performance Metrics 3241 (IPPM)", STD 81, RFC 7679, DOI 10.17487/RFC7679, January 3242 2016, . 3244 [RFC7680] Almes, G., Kalidindi, S., Zekauskas, M., and A. Morton, 3245 Ed., "A One-Way Loss Metric for IP Performance Metrics 3246 (IPPM)", STD 82, RFC 7680, DOI 10.17487/RFC7680, January 3247 2016, . 3249 [RFC7752] Gredler, H., Ed., Medved, J., Previdi, S., Farrel, A., and 3250 S. Ray, "North-Bound Distribution of Link-State and 3251 Traffic Engineering (TE) Information Using BGP", RFC 7752, 3252 DOI 10.17487/RFC7752, March 2016, 3253 . 3255 [RFC7810] Previdi, S., Ed., Giacalone, S., Ward, D., Drake, J., and 3256 Q. Wu, "IS-IS Traffic Engineering (TE) Metric Extensions", 3257 RFC 7810, DOI 10.17487/RFC7810, May 2016, 3258 . 3260 [RFC7923] Voit, E., Clemm, A., and A. Gonzalez Prieto, "Requirements 3261 for Subscription to YANG Datastores", RFC 7923, 3262 DOI 10.17487/RFC7923, June 2016, 3263 . 3265 [RFC7950] Bjorklund, M., Ed., "The YANG 1.1 Data Modeling Language", 3266 RFC 7950, DOI 10.17487/RFC7950, August 2016, 3267 . 3269 [RFC8040] Bierman, A., Bjorklund, M., and K. Watsen, "RESTCONF 3270 Protocol", RFC 8040, DOI 10.17487/RFC8040, January 2017, 3271 . 3273 [RFC8051] Zhang, X., Ed. and I. Minei, Ed., "Applicability of a 3274 Stateful Path Computation Element (PCE)", RFC 8051, 3275 DOI 10.17487/RFC8051, January 2017, 3276 . 3278 [RFC8231] Crabbe, E., Minei, I., Medved, J., and R. Varga, "Path 3279 Computation Element Communication Protocol (PCEP) 3280 Extensions for Stateful PCE", RFC 8231, 3281 DOI 10.17487/RFC8231, September 2017, 3282 . 3284 [RFC8281] Crabbe, E., Minei, I., Sivabalan, S., and R. Varga, "Path 3285 Computation Element Communication Protocol (PCEP) 3286 Extensions for PCE-Initiated LSP Setup in a Stateful PCE 3287 Model", RFC 8281, DOI 10.17487/RFC8281, December 2017, 3288 . 3290 [RFC8283] Farrel, A., Ed., Zhao, Q., Ed., Li, Z., and C. Zhou, "An 3291 Architecture for Use of PCE and the PCE Communication 3292 Protocol (PCEP) in a Network with Central Control", 3293 RFC 8283, DOI 10.17487/RFC8283, December 2017, 3294 . 
3296 [RFC8402] Filsfils, C., Ed., Previdi, S., Ed., Ginsberg, L., 3297 Decraene, B., Litkowski, S., and R. Shakir, "Segment 3298 Routing Architecture", RFC 8402, DOI 10.17487/RFC8402, 3299 July 2018, . 3301 [RFC8453] Ceccarelli, D., Ed. and Y. Lee, Ed., "Framework for 3302 Abstraction and Control of TE Networks (ACTN)", RFC 8453, 3303 DOI 10.17487/RFC8453, August 2018, 3304 . 3306 [RFC8571] Ginsberg, L., Ed., Previdi, S., Wu, Q., Tantsura, J., and 3307 C. Filsfils, "BGP - Link State (BGP-LS) Advertisement of 3308 IGP Traffic Engineering Performance Metric Extensions", 3309 RFC 8571, DOI 10.17487/RFC8571, March 2019, 3310 . 3312 [RFC8661] Bashandy, A., Ed., Filsfils, C., Ed., Previdi, S., 3313 Decraene, B., and S. Litkowski, "Segment Routing MPLS 3314 Interworking with LDP", RFC 8661, DOI 10.17487/RFC8661, 3315 December 2019, . 3317 [RR94] Rodrigues, M. and K. Ramakrishnan, "Optimal Routing in 3318 Shortest Path Networks", Proceedings ITS'94, Rio de 3319 Janeiro, Brazil, 1994. 3321 [SLDC98] Suter, B., Lakshman, T., Stiliadis, D., and A. Choudhury, 3322 "Design Considerations for Supporting TCP with Per-flow 3323 Queueing", Proceedings INFOCOM'98, p. 299-306, 1998. 3325 [WANG] Wang, Y., Wang, Z., and L. Zhang, "Internet traffic 3326 engineering without full mesh overlaying", 3327 Proceedings INFOCOM'2001, April 2001. 3329 [XIAO] Xiao, X., Hannan, A., Bailey, B., and L. Ni, "Traffic 3330 Engineering with MPLS in the Internet", Article IEEE 3331 Network Magazine, March 2000. 3333 [YARE95] Yang, C. and A. Reddy, "A Taxonomy for Congestion Control 3334 Algorithms in Packet Switching Networks", Article IEEE 3335 Network Magazine, p. 34-45, 1995. 3337 Appendix A. Historic Overview 3339 A.1. Traffic Engineering in Classical Telephone Networks 3341 This subsection presents a brief overview of traffic engineering in 3342 telephone networks which often relates to the way user traffic is 3343 steered from an originating node to the terminating node. This 3344 subsection presents a brief overview of this topic. A detailed 3345 description of the various routing strategies applied in telephone 3346 networks is included in the book by G. Ash [ASH2]. 3348 The early telephone network relied on static hierarchical routing, 3349 whereby routing patterns remained fixed independent of the state of 3350 the network or time of day. The hierarchy was intended to 3351 accommodate overflow traffic, improve network reliability via 3352 alternate routes, and prevent call looping by employing strict 3353 hierarchical rules. The network was typically over-provisioned since 3354 a given fixed route had to be dimensioned so that it could carry user 3355 traffic during a busy hour of any busy day. Hierarchical routing in 3356 the telephony network was found to be too rigid upon the advent of 3357 digital switches and stored program control which were able to manage 3358 more complicated traffic engineering rules. 3360 Dynamic routing was introduced to alleviate the routing inflexibility 3361 in the static hierarchical routing so that the network would operate 3362 more efficiently. This resulted in significant economic gains 3363 [HUSS87]. Dynamic routing typically reduces the overall loss 3364 probability by 10 to 20 percent (compared to static hierarchical 3365 routing). Dynamic routing can also improve network resilience by 3366 recalculating routes on a per-call basis and periodically updating 3367 routes. 3369 There are three main types of dynamic routing in the telephone 3370 network. 
They are time-dependent routing, state-dependent routing (SDR), and event-dependent routing (EDR).

In time-dependent routing, regular variations in traffic loads (such as time of day or day of week) are exploited in pre-planned routing tables. In state-dependent routing, routing tables are updated online according to the current state of the network (e.g., traffic demand, utilization, etc.). In event-dependent routing, routing changes are triggered by events (such as call setups encountering congested or blocked links) whereupon new paths are searched out using learning models. EDR methods are real-time adaptive, but, unlike SDR, they do not require global state information. Examples of EDR schemes include the dynamic alternate routing (DAR) from BT, the state-and-time dependent routing (STR) from NTT, and the success-to-the-top (STT) routing from AT&T.

Dynamic non-hierarchical routing (DNHR) is an example of dynamic routing that was introduced in the AT&T toll network in the 1980s to respond to time-dependent information such as regular load variations as a function of time. Time-dependent information in terms of load may be divided into three timescales: hourly, weekly, and yearly. Correspondingly, three algorithms are defined to pre-plan the routing tables. The network design algorithm operates over a year-long interval, while the demand servicing algorithm operates on a weekly basis to fine-tune link sizes and routing tables to correct forecast errors made on the yearly basis. At the smallest timescale, the routing algorithm is used to make limited adjustments based on daily traffic variations. Network design and demand servicing are computed using offline calculations. Typically, the calculations require extensive searches on possible routes. On the other hand, routing may need online calculations to handle crankback. DNHR adopts a "two-link" approach whereby a path can consist of two links at most. The routing algorithm presents an ordered list of route choices between an originating switch and a terminating switch. If a call overflows, a via switch (a tandem exchange between the originating switch and the terminating switch) would send a crankback signal to the originating switch. This switch would then select the next route, and so on, until there are no alternative routes available, at which point the call is blocked.

A.2. Evolution of Traffic Engineering in Packet Networks

This subsection reviews related prior work that was intended to improve the performance of data networks. Indeed, optimization of the performance of data networks started in the early days of the ARPANET. Other early commercial networks such as SNA also recognized the importance of performance optimization and service differentiation.

In terms of traffic management, the Internet has been a best-effort service environment until recently. In particular, very limited traffic management capabilities existed in IP networks to provide differentiated queue management and scheduling services to packets belonging to different classes.

In terms of routing control, the Internet has employed distributed protocols for intra-domain routing. These protocols are highly scalable and resilient.
However, they are based on simple algorithms for path selection that offer very limited functionality for flexible control of the path selection process.

The following subsections review the evolution of practical traffic engineering mechanisms in IP networks and their predecessors.

A.2.1. Adaptive Routing in the ARPANET

The early ARPANET recognized the importance of adaptive routing, where routing decisions are based on the current state of the network [MCQ80]. Early minimum-delay routing approaches forwarded each packet to its destination along a path for which the total estimated transit time was smallest. Each node maintained a table of network delays, representing the estimated delay that a packet would experience along a given path toward its destination. The minimum-delay table was periodically transmitted by a node to its neighbors. The shortest path, in terms of hop count, was also propagated to give connectivity information.

One drawback of this approach is that dynamic link metrics tend to create "traffic magnets", causing congestion to be shifted from one location in the network to another and resulting in oscillation and network instability.

A.2.2. Dynamic Routing in the Internet

The Internet evolved from the ARPANET and adopted dynamic routing algorithms with distributed control to determine the paths that packets should take en route to their destinations. The routing algorithms are adaptations of shortest path algorithms where costs are based on link metrics. The link metric can be based on static or dynamic quantities. A link metric based on static quantities may be assigned administratively according to local criteria. A link metric based on dynamic quantities may be a function of a network congestion measure such as delay or packet loss.

It was apparent early on that static link metric assignment was inadequate because it can easily lead to unfavorable scenarios in which some links become congested while others remain lightly loaded. One of the many reasons for the inadequacy of static link metrics is that link metric assignment was often done without considering the traffic matrix of the network. Also, the routing protocols did not take traffic attributes and capacity constraints into account when making routing decisions. This results in traffic concentration being localized in subsets of the network infrastructure, potentially causing congestion. Even if link metrics are assigned in accordance with the traffic matrix, unbalanced loads in the network can still occur due to a number of factors, including:

o Resources may not be deployed in the most optimal locations from a routing perspective.

o Forecasting errors in traffic volume and/or traffic distribution.

o Dynamics in the traffic matrix due to the temporal nature of traffic patterns, BGP policy changes from peers, etc.

The inadequacy of the legacy Internet interior gateway routing system is one of the factors motivating interest in path-oriented technology with explicit routing and constraint-based routing capabilities, such as MPLS.
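To illustrate the behavior described above, the following sketch computes routes from static link metrics using Dijkstra's algorithm. It is purely illustrative: the use of Python, the topology, and the metric values are assumptions made for this example and do not represent any particular IGP implementation. The point it demonstrates is that all traffic toward a destination follows the single selected path, regardless of how loaded that path becomes.

   # Illustrative sketch: shortest-path routing with static link metrics.
   # The topology and metric values are invented for this example.
   import heapq

   def shortest_paths(graph, source):
       """Return the minimum metric to each node and the predecessor of
       each node on its selected path (a simple Dijkstra computation)."""
       dist = {source: 0}
       prev = {}
       queue = [(0, source)]
       while queue:
           d, node = heapq.heappop(queue)
           if d > dist.get(node, float("inf")):
               continue
           for neighbor, metric in graph[node]:
               nd = d + metric
               if nd < dist.get(neighbor, float("inf")):
                   dist[neighbor] = nd
                   prev[neighbor] = node
                   heapq.heappush(queue, (nd, neighbor))
       return dist, prev

   # Triangle topology: the direct A-B link (metric 1) always beats the
   # A-C-B alternative (total metric 2), so every A-to-B packet uses the
   # A-B link no matter how congested that link becomes.
   topology = {
       "A": [("B", 1), ("C", 1)],
       "B": [("A", 1), ("C", 1)],
       "C": [("A", 1), ("B", 1)],
   }
   dist, prev = shortest_paths(topology, "A")
   print(dist["B"], prev["B"])   # prints: 1 A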
A.2.3. ToS Routing

Type-of-Service (ToS) routing involves different routes going to the same destination, with selection dependent upon the ToS field of an IP packet [RFC2474]. The ToS classes may include, for example, low delay and high throughput. Each link is associated with multiple link costs, and each link cost is used to compute routes for a particular ToS. A separate shortest path tree is computed for each ToS. The shortest path algorithm must be run for each ToS, resulting in very expensive computation. Classical ToS-based routing is now outdated as the ToS field in the IP header has been superseded by the Diffserv field. Effective traffic engineering is difficult to perform in classical ToS-based routing because each class still relies exclusively on shortest path routing, which results in localization of traffic concentration within the network.

A.2.4. Equal Cost Multi-Path

Equal Cost Multi-Path (ECMP) is another technique that attempts to address the deficiency in the Shortest Path First (SPF) interior gateway routing systems [RFC2328]. In the classical SPF algorithm, if two or more shortest paths exist to a given destination, the algorithm will choose one of them. The algorithm is modified slightly in ECMP so that if two or more equal-cost shortest paths exist between two nodes, the traffic between the nodes is distributed among the multiple equal-cost paths. Traffic distribution across the equal-cost paths is usually performed in one of two ways: (1) packet-based in a round-robin fashion, or (2) flow-based using hashing on source and destination IP addresses and possibly other fields of the IP header. The first approach can easily cause out-of-order packets, while the second approach is dependent upon the number and distribution of flows. Flow-based load sharing may be unpredictable in an enterprise network where the number of flows is relatively small and less heterogeneous (for example, hashing may not be uniform), but it is generally effective in core public networks where the number of flows is large and heterogeneous.

In ECMP, link costs are static and bandwidth constraints are not considered, so ECMP attempts to distribute the traffic as equally as possible among the equal-cost paths independent of the congestion status of each path. As a result, given two equal-cost paths, it is possible that one of the paths will be more congested than the other. Another drawback of ECMP is that load sharing cannot be achieved on multiple paths that have non-identical costs.
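The flow-based variant can be sketched as follows. This is purely illustrative: the use of Python, the CRC-32 hash, and the restriction of the hash input to the source and destination addresses are assumptions made for this example; deployed routers use implementation-specific hash functions and may include additional header fields such as protocol and ports.

   # Illustrative sketch of flow-based ECMP next-hop selection.
   import zlib

   def select_next_hop(src_ip, dst_ip, next_hops):
       """Map a flow (identified here only by its address pair) onto one
       of the available equal-cost next hops.  All packets of the same
       flow hash to the same next hop, preserving ordering within the
       flow; how evenly load is spread depends on the number and
       distribution of flows."""
       key = (src_ip + "|" + dst_ip).encode()
       index = zlib.crc32(key) % len(next_hops)
       return next_hops[index]

   # Example: three equal-cost next hops toward the same destination.
   next_hops = ["nh-A", "nh-B", "nh-C"]
   print(select_next_hop("192.0.2.1", "198.51.100.7", next_hops))
   print(select_next_hop("192.0.2.2", "198.51.100.7", next_hops))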
A.2.5. Nimrod

Nimrod was a routing system developed to provide heterogeneous service-specific routing in the Internet, while taking multiple constraints into account [RFC1992]. Essentially, Nimrod was a link-state routing protocol supporting path-oriented packet forwarding. It used the concept of maps to represent network connectivity and services at multiple levels of abstraction. Mechanisms allowed restriction of the distribution of routing information.

Even though Nimrod did not enjoy deployment in the public Internet, a number of key concepts incorporated into the Nimrod architecture, such as explicit routing, which allows selection of paths at originating nodes, are beginning to find applications in some recent constraint-based routing initiatives.

A.3. Development of Internet Traffic Engineering

A.3.1. Overlay Model

In the overlay model, a virtual-circuit network, such as SONET/SDH, OTN, or WDM, provides virtual-circuit connectivity between routers that are located at the edges of a virtual-circuit cloud. In this model, two routers that are connected through a virtual circuit see a direct adjacency between themselves, independent of the physical route taken by the virtual circuit through the underlying network. Thus, the overlay model essentially decouples the logical topology that routers see from the physical topology that the underlying network manages. The overlay model enables a network administrator or an automaton to employ traffic engineering concepts to perform path optimization by reconfiguring or rearranging the virtual circuits so that a virtual circuit on a congested or sub-optimal physical link can be re-routed to a less congested or more optimal one. In the overlay model, traffic engineering is also employed to establish relationships between the traffic management parameters of the virtual-circuit technology (e.g., PCR, SCR, and MBS for ATM) and the actual traffic that traverses each circuit. These relationships can be established based upon known or projected traffic profiles and other factors.

Appendix B. Overview of Traffic Engineering Related Work in Other SDOs

B.1. Overview of ITU Activities Related to Traffic Engineering

This section provides an overview of prior work within the ITU-T pertaining to traffic engineering in traditional telecommunications networks.

ITU-T Recommendations E.600 [ITU-E600], E.701 [ITU-E701], and E.801 [ITU-E801] address traffic engineering issues in traditional telecommunications networks. Recommendation E.600 provides a vocabulary for describing traffic engineering concepts, while E.701 defines reference connections, Grade of Service (GoS), and traffic parameters for ISDN. Recommendation E.701 uses the concept of a reference connection to identify representative cases of different types of connections without describing the specifics of their actual realizations by different physical means. As defined in Recommendation E.600, "a connection is an association of resources providing means for communication between two or more devices in, or attached to, a telecommunication network." E.600 also defines a resource as "any set of physically or conceptually identifiable entities within a telecommunication network, the use of which can be unambiguously determined" [ITU-E600]. There can be different types of connections as the number and types of resources in a connection may vary.

Typically, different network segments are involved in the path of a connection. For example, a connection may be local, national, or international. The purposes of reference connections are to clarify and specify traffic performance issues at various interfaces between different network domains. Each domain may consist of one or more service provider networks.

Reference connections provide a basis to define GoS parameters related to traffic engineering within the ITU-T framework.
As defined in E.600, "GoS refers to a number of traffic engineering variables which are used to provide a measure of the adequacy of a group of resources under specified conditions." These GoS variables may be probability of loss, dial-tone delay, etc. They are essential for network internal design and operation as well as for component performance specification.

GoS is different from Quality of Service (QoS) in the ITU framework. QoS is the performance perceivable by a telecommunication service user and expresses the user's degree of satisfaction with the service. QoS parameters focus on performance aspects observable at the service access points and network interfaces, rather than their causes within the network. GoS, on the other hand, is a set of network-oriented measures that characterize the adequacy of a group of resources under specified conditions. For a network to be effective in serving its users, the values of both GoS and QoS parameters must be related, with GoS parameters typically making a major contribution to the QoS.

Recommendation E.600 stipulates that a set of GoS parameters must be selected and defined on an end-to-end basis for each major service category provided by a network to assist the network provider with improving the efficiency and effectiveness of the network. Based on a selected set of reference connections, suitable target values are assigned to the selected GoS parameters under normal and high load conditions. These end-to-end GoS target values are then apportioned to individual resource components of the reference connections for dimensioning purposes.

Appendix C. Summary of Changes Since RFC 3272

This section is a placeholder. It is expected that once work on this document is nearly complete, this section will be updated to provide an overview of the structural and substantive changes from RFC 3272.

Author's Address

Adrian Farrel (editor)
Old Dog Consulting

Email: adrian@olddog.co.uk