2 TEAS Working Group A. Farrel, Ed. 3 Internet-Draft Old Dog Consulting 4 Obsoletes: 3272 (if approved) November 15, 2020 5 Intended status: Informational 6 Expires: May 19, 2021 8 Overview and Principles of Internet Traffic Engineering 9 draft-ietf-teas-rfc3272bis-03 11 Abstract 13 This document describes the principles of traffic engineering (TE) in 14 the Internet. The document is intended to promote better 15 understanding of the issues surrounding traffic engineering in IP 16 networks and the networks that support IP networking, and to provide 17 a common basis for the development of traffic engineering 18 capabilities for the Internet. The principles, architectures, and 19 methodologies for performance evaluation and performance optimization 20 of operational networks are also discussed. 22 This work was first published as RFC 3272 in May 2002. This document 23 obsoletes RFC 3272 by making a complete update to bring the text in 24 line with best current practices for Internet traffic engineering and 25 to include references to the latest relevant work in the IETF. 27 Status of This Memo 29 This Internet-Draft is submitted in full conformance with the 30 provisions of BCP 78 and BCP 79. 32 Internet-Drafts are working documents of the Internet Engineering 33 Task Force (IETF). Note that other groups may also distribute 34 working documents as Internet-Drafts. The list of current Internet- 35 Drafts is at https://datatracker.ietf.org/drafts/current/. 37 Internet-Drafts are draft documents valid for a maximum of six months 38 and may be updated, replaced, or obsoleted by other documents at any 39 time. It is inappropriate to use Internet-Drafts as reference 40 material or to cite them other than as "work in progress." 42 This Internet-Draft will expire on May 19, 2021. 44 Copyright Notice 46 Copyright (c) 2020 IETF Trust and the persons identified as the 47 document authors. All rights reserved.
49 This document is subject to BCP 78 and the IETF Trust's Legal 50 Provisions Relating to IETF Documents 51 (https://trustee.ietf.org/license-info) in effect on the date of 52 publication of this document. Please review these documents 53 carefully, as they describe your rights and restrictions with respect 54 to this document. Code Components extracted from this document must 55 include Simplified BSD License text as described in Section 4.e of 56 the Trust Legal Provisions and are provided without warranty as 57 described in the Simplified BSD License. 59 Table of Contents 61 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 4 62 1.1. What is Internet Traffic Engineering? . . . . . . . . . . 4 63 1.2. Components of Traffic Engineering . . . . . . . . . . . . 6 64 1.3. Scope . . . . . . . . . . . . . . . . . . . . . . . . . . 8 65 1.4. Terminology . . . . . . . . . . . . . . . . . . . . . . . 8 66 2. Background . . . . . . . . . . . . . . . . . . . . . . . . . 11 67 2.1. Context of Internet Traffic Engineering . . . . . . . . . 11 68 2.2. Network Domain Context . . . . . . . . . . . . . . . . . 12 69 2.3. Problem Context . . . . . . . . . . . . . . . . . . . . . 14 70 2.3.1. Congestion and its Ramifications . . . . . . . . . . 15 71 2.4. Solution Context . . . . . . . . . . . . . . . . . . . . 15 72 2.4.1. Combating the Congestion Problem . . . . . . . . . . 17 73 2.5. Implementation and Operational Context . . . . . . . . . 20 74 3. Traffic Engineering Process Models . . . . . . . . . . . . . 21 75 3.1. Components of the Traffic Engineering Process Model . . . 21 76 4. Review of TE Techniques . . . . . . . . . . . . . . . . . . . 22 77 4.1. Overview of IETF Projects Related to Traffic Engineering 22 78 4.1.1. Constraint-Based Routing . . . . . . . . . . . . . . 22 79 4.1.2. Integrated Services . . . . . . . . . . . . . . . . . 23 80 4.1.3. RSVP . . . . . . . . . . . . . . . . . . . . . . . . 24 81 4.1.4. Differentiated Services . . . . . . . . . . . . . . . 25 82 4.1.5. QUIC . . . . . . . . . . . . . . . . . . . . . . . . 26 83 4.1.6. Multiprotocol Label Switching (MPLS) . . . . . . . . 26 84 4.1.7. Generalized MPLS . . . . . . . . . . . . . . . . . . 26 85 4.1.8. IP Performance Metrics . . . . . . . . . . . . . . . 27 86 4.1.9. Flow Measurement . . . . . . . . . . . . . . . . . . 27 87 4.1.10. Endpoint Congestion Management . . . . . . . . . . . 28 88 4.1.11. TE Extensions to the IGPs . . . . . . . . . . . . . . 28 89 4.1.12. Link-State BGP . . . . . . . . . . . . . . . . . . . 28 90 4.1.13. Path Computation Element . . . . . . . . . . . . . . 29 91 4.1.14. Application-Layer Traffic Optimization . . . . . . . 29 92 4.1.15. Segment Routing with MPLS encapsulation (SR-MPLS) . . 29 93 4.1.16. Network Virtualization and Abstraction . . . . . . . 31 94 4.1.17. Network Slicing . . . . . . . . . . . . . . . . . . . 31 95 4.1.18. Deterministic Networking . . . . . . . . . . . . . . 32 96 4.1.19. Network TE State Definition and Presentation . . . . 32 97 4.1.20. System Management and Control Interfaces . . . . . . 32 98 4.2. Content Distribution . . . . . . . . . . . . . . . . . . 32 99 5. Taxonomy of Traffic Engineering Systems . . . . . . . . . . . 33 100 5.1. Time-Dependent Versus State-Dependent Versus Event 101 Dependent . . . . . . . . . . . . . . . . . . . . . . . . 34 102 5.2. Offline Versus Online . . . . . . . . . . . . . . . . . . 35 103 5.3. Centralized Versus Distributed . . . . . . . . . . . . . 35 104 5.3.1. Hybrid Systems . . . . . . . . . . . . . . . . . . .
36 105 5.3.2. Considerations for Software Defined Networking . . . 36 106 5.4. Local Versus Global . . . . . . . . . . . . . . . . . . . 36 107 5.5. Prescriptive Versus Descriptive . . . . . . . . . . . . . 36 108 5.5.1. Intent-Based Networking . . . . . . . . . . . . . . . 37 109 5.6. Open-Loop Versus Closed-Loop . . . . . . . . . . . . . . 37 110 5.7. Tactical versus Strategic . . . . . . . . . . . . . . . . 37 111 6. Recommendations for Internet Traffic Engineering . . . . . . 37 112 6.1. Generic Non-functional Recommendations . . . . . . . . . 38 113 6.2. Routing Recommendations . . . . . . . . . . . . . . . . . 40 114 6.3. Traffic Mapping Recommendations . . . . . . . . . . . . . 42 115 6.4. Measurement Recommendations . . . . . . . . . . . . . . . 43 116 6.5. Network Survivability . . . . . . . . . . . . . . . . . . 44 117 6.5.1. Survivability in MPLS Based Networks . . . . . . . . 46 118 6.5.2. Protection Option . . . . . . . . . . . . . . . . . . 47 119 6.6. Traffic Engineering in Diffserv Environments . . . . . . 48 120 6.7. Network Controllability . . . . . . . . . . . . . . . . . 50 121 7. Inter-Domain Considerations . . . . . . . . . . . . . . . . . 51 122 8. Overview of Contemporary TE Practices in Operational IP 123 Networks . . . . . . . . . . . . . . . . . . . . . . . . . . 53 124 9. Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . 57 125 10. Security Considerations . . . . . . . . . . . . . . . . . . . 57 126 11. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 57 127 12. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 57 128 13. Contributors . . . . . . . . . . . . . . . . . . . . . . . . 59 129 14. Informative References . . . . . . . . . . . . . . . . . . . 60 130 Appendix A. Historic Overview . . . . . . . . . . . . . . . . . 69 131 A.1. Traffic Engineering in Classical Telephone Networks . . . 69 132 A.2. Evolution of Traffic Engineering in Packet Networks . . . 70 133 A.2.1. Adaptive Routing in the ARPANET . . . . . . . . . . . 71 134 A.2.2. Dynamic Routing in the Internet . . . . . . . . . . . 71 135 A.2.3. ToS Routing . . . . . . . . . . . . . . . . . . . . . 72 136 A.2.4. Equal Cost Multi-Path . . . . . . . . . . . . . . . . 72 137 A.2.5. Nimrod . . . . . . . . . . . . . . . . . . . . . . . 73 138 A.3. Development of Internet Traffic Engineering . . . . . . . 73 139 A.3.1. Overlay Model . . . . . . . . . . . . . . . . . . . . 73 140 Appendix B. Overview of Traffic Engineering Related Work in 141 Other SDOs . . . . . . . . . . . . . . . . . . . . . 74 142 B.1. Overview of ITU Activities Related to Traffic Engineering 74 143 Appendix C. Summary of Changes Since RFC 3272 . . . . . . . . . 75 144 Author's Address . . . . . . . . . . . . . . . . . . . . . . . . 75 146 1. Introduction 148 This document describes the principles of Internet traffic 149 engineering (TE). The objective of the document is to articulate the 150 general issues and principles for Internet traffic engineering, and 151 where appropriate to provide recommendations, guidelines, and options 152 for the development of online and offline Internet traffic 153 engineering capabilities and support systems. 155 This document provides a terminology and taxonomy for describing and 156 understanding common Internet traffic engineering concepts. 158 Even though Internet traffic engineering is most effective when 159 applied end-to-end, the focus of this document is traffic engineering 160 within a given domain (such as an autonomous system). 
However, 161 because a preponderance of Internet traffic tends to originate in one 162 autonomous system and terminate in another, this document also 163 provides an overview of aspects pertaining to inter-domain traffic 164 engineering. 166 This work was first published as [RFC3272] in May 2002. This 167 document obsoletes [RFC3272] by making a complete update to bring the 168 text in line with best current practices for Internet traffic 169 engineering and to include references to the latest relevant work in 170 the IETF. It is worth noting that around three-fifths of the RFCs 171 referenced in this document post-date the publication of RFC 3272. 172 Appendix C provides a summary of changes between RFC 3272 and this 173 document. 175 1.1. What is Internet Traffic Engineering? 177 One of the most significant functions performed by the Internet is 178 the routing of traffic from ingress nodes to egress nodes. 179 Therefore, one of the most distinctive functions performed by 180 Internet traffic engineering is the control and optimization of the 181 routing function, to steer traffic through the network. 183 Internet traffic engineering is defined as that aspect of Internet 184 network engineering dealing with the issues of performance evaluation 185 and performance optimization of operational IP networks. Traffic 186 engineering encompasses the application of technology and scientific 187 principles to the measurement, characterization, modeling, and 188 control of Internet traffic [RFC2702], [AWD2]. 190 It is the performance of the network as seen by end users of network 191 services that is paramount. The characteristics visible to end users 192 are the emergent properties of the network, which are the 193 characteristics of the network when viewed as a whole. A central 194 goal of the service provider, therefore, is to enhance the emergent 195 properties of the network while taking economic considerations into 196 account. This is accomplished by addressing traffic oriented 197 performance requirements while utilizing network resources 198 economically and reliably. Traffic oriented performance measures 199 include delay, delay variation, packet loss, and throughput. 201 Internet traffic engineering responds to network events. Aspects of 202 capacity management respond at intervals ranging from days to years. 203 Routing control functions operate at intervals ranging from 204 milliseconds to days. Packet level processing functions operate at 205 very fine levels of temporal resolution, ranging from picoseconds to 206 milliseconds, while reacting to the real-time statistical behavior of 207 traffic. 209 Thus, the optimization aspects of traffic engineering can be viewed 210 from a control perspective, and can be both pro-active and reactive. 211 In the pro-active case, the traffic engineering control system takes 212 preventive action to protect against predicted unfavorable future 213 network states, for example, by engineering backup paths. It may 214 also take action that will lead to a more desirable future network 215 state. In the reactive case, the control system responds to correct 216 issues and adapts to network events, such as routing after a failure. 218 Another important objective of Internet traffic engineering is to 219 facilitate reliable network operations [RFC2702]. Reliable network 220 operations can be facilitated by providing mechanisms that enhance 221 network integrity and by embracing policies emphasizing network 222 survivability.
This reduces the vulnerability of services to outages 223 arising from errors, faults, and failures occurring within the 224 network infrastructure. 226 The optimization aspects of traffic engineering can be achieved 227 through capacity management and traffic management. In this 228 document, capacity management includes capacity planning, routing 229 control, and resource management. Network resources of particular 230 interest include link bandwidth, buffer space, and computational 231 resources. In this document, traffic management includes: 233 1. nodal traffic control functions such as traffic conditioning, 234 queue management, and scheduling 236 2. other functions that regulate traffic flow through the network or 237 that arbitrate access to network resources between different 238 packets or between different traffic streams. 240 One major challenge of Internet traffic engineering is the 241 realization of automated control capabilities that adapt quickly and 242 cost effectively to significant changes in network state, while still 243 maintaining stability of the network. Performance evaluation can 244 assess the effectiveness of traffic engineering methods, and the 245 results of this evaluation can be used to identify existing problems, 246 guide network re-optimization, and aid in the prediction of potential 247 future problems. However, this process can also be time consuming 248 and may not be suitable to act on short-lived changes in the network. 250 Performance evaluation can be achieved in many different ways. The 251 most notable techniques include analytical methods, simulation, and 252 empirical methods based on measurements. 254 Traffic engineering comes in two flavors: either a background process 255 that constantly monitors traffic and optimizes the use of resources 256 to improve performance; or a form of pre-planned traffic 257 distribution that is considered optimal. In the latter case, any 258 deviation from the optimum distribution (e.g., caused by a fiber cut) 259 is reverted upon repair without further optimization. However, this 260 form of traffic engineering relies upon the notion that the planned 261 state of the network is optimal. Hence, in such a mode there are two 262 levels of traffic engineering: the TE-planning task to enable optimum 263 traffic distribution, and the routing task keeping traffic flows 264 attached to the pre-planned distribution. 266 As a general rule, traffic engineering concepts and mechanisms must 267 be sufficiently specific and well-defined to address known 268 requirements, but simultaneously flexible and extensible to 269 accommodate unforeseen future demands. 271 1.2. Components of Traffic Engineering 273 As mentioned in Section 1.1, Internet traffic engineering provides 274 performance optimization of operational IP networks while utilizing 275 network resources economically and reliably. Such optimization is 276 supported at the control/controller level and within the data/ 277 forwarding plane. 279 The key elements required in any TE solution are as follows: 281 1. Policy 283 2. Path steering 285 3. Resource management 287 Some TE solutions rely on these elements to a greater or lesser 288 extent. Debate remains about whether a solution can truly be called 289 traffic engineering if it does not include all of these elements. 291 For the sake of this document, we assert that all TE solutions must 292 include some aspects of all of these elements.
Other solutions can 293 be classed as "partial TE" and also fall within the scope of this document. 295 Policy allows for the selection of next hops and paths based on 296 information beyond basic reachability. Early definitions of routing 297 policy, e.g., [RFC1102] and [RFC1104], discuss routing policy being 298 applied to restrict access to network resources at an aggregate 299 level. BGP is an example of a commonly used mechanism for applying 300 such policies, see [RFC4271] and [I-D.ietf-idr-rfc5575bis]. In the 301 traffic engineering context, policy decisions are made within the 302 control plane or by controllers, and govern the selection of paths. 303 Examples can be found in [RFC4655] and [RFC5394]. Standard TE 304 solutions may cover the mechanisms to distribute and/or enforce 305 policies, but specific policy definition is generally unspecified. 307 Path steering is the ability to forward packets using more 308 information than just knowledge of the next hop. Examples of path 309 steering include IPv4 source routes [RFC0791], RSVP-TE explicit 310 routes [RFC3209], and Segment Routing [RFC8402]. Path steering for 311 TE can be supported via control plane protocols, by encoding in the 312 data plane headers, or by a combination of the two. This includes 313 when control is provided by a controller using a southbound (i.e., 314 controller to router) control protocol. 316 Resource management provides resource aware control and forwarding. 317 Examples of resources are bandwidth, buffers, and queues, all of 318 which can be managed to control loss and latency. 320 Resource reservation is the control aspect of resource management. 321 It provides for domain-wide consensus about which network 322 resources are used by a particular flow. This determination may 323 be made at a very coarse or very fine level. Note that this 324 consensus exists at the network control or controller level, not 325 within the data plane. It may be composed purely of accounting/ 326 bookkeeping, but it typically includes an ability to admit, 327 reject, or reclassify a flow based on policy. Such accounting can 328 be done based on any combination of a static understanding of 329 resource requirements, and the use of dynamic mechanisms to 330 collect requirements (e.g., via [RFC3209]) and resource 331 availability (e.g., via [RFC4203]). 333 Resource allocation is the data plane aspect of resource 334 management. It provides for the allocation of specific node and 335 link resources to specific flows. Example resources include 336 buffers, policing, and rate-shaping mechanisms that are typically 337 supported via queuing. It also includes the matching of a flow 338 (i.e., flow classification) to a particular set of allocated 339 resources. The method of flow classification and granularity of 340 resource management is technology specific. Examples include 341 DiffServ with dropping and remarking [RFC4594], MPLS-TE [RFC3209], 342 and GMPLS based label switched paths [RFC3945], as well as 343 controller-based solutions [RFC8453]. This level of resource 344 control, while optional, is important in networks that wish to 345 support congestion management policies to control or regulate the 346 offered traffic to deliver different levels of service and 347 alleviate congestion problems, or those networks that wish to 348 control latencies experienced by specific traffic flows. 350 1.3. Scope 352 The scope of this document is intra-domain traffic engineering.
That 353 is, traffic engineering within a given autonomous system in the 354 Internet. This document discusses concepts pertaining to intra- 355 domain traffic control, including such issues as routing control, 356 micro and macro resource allocation, and the control coordination 357 problems that arise consequently. 359 This document describes and characterizes techniques already in use 360 or in advanced development for Internet traffic engineering. The way 361 these techniques fit together is discussed and scenarios in which 362 they are useful will be identified. 364 Although the emphasis in this document is on intra-domain traffic 365 engineering, in Section 7, an overview of the high level 366 considerations pertaining to inter-domain traffic engineering will be 367 provided. Inter-domain Internet traffic engineering is crucial to 368 the performance enhancement of the global Internet infrastructure. 370 Whenever possible, relevant requirements from existing IETF documents 371 and other sources are incorporated by reference. 373 1.4. Terminology 375 This section provides terminology which is useful for Internet 376 traffic engineering. The definitions presented apply to this 377 document. These terms may have other meanings elsewhere. 379 Busy hour: A one hour period within a specified interval of time 380 (typically 24 hours) in which the traffic load in a network or 381 sub-network is greatest. 383 Congestion: A state of a network resource in which the traffic 384 incident on the resource exceeds its output capacity over an 385 interval of time. 387 Congestion avoidance: An approach to congestion management that 388 attempts to obviate the occurrence of congestion. 390 Congestion control: An approach to congestion management that 391 attempts to remedy congestion problems that have already occurred. 393 Constraint-based routing: A class of routing protocols that take 394 specified traffic attributes, network constraints, and policy 395 constraints into account when making routing decisions. 396 Constraint-based routing is applicable to traffic aggregates as 397 well as flows. It is a generalization of QoS routing. 399 Demand side congestion management: A congestion management scheme 400 that addresses congestion problems by regulating or conditioning 401 offered load. 403 Effective bandwidth: The minimum amount of bandwidth that can be 404 assigned to a flow or traffic aggregate in order to deliver 405 'acceptable service quality' to the flow or traffic aggregate. 407 Hot-spot: A network element or subsystem which is in a state of 408 congestion. 410 Inter-domain traffic: Traffic that originates in one Autonomous 411 system and terminates in another. 413 Metric: A parameter defined in terms of standard units of 414 measurement. 416 Measurement methodology: A repeatable measurement technique used to 417 derive one or more metrics of interest. 419 Network survivability: The capability to provide a prescribed level 420 of QoS for existing services after a given number of failures 421 occur within the network. 423 Offline traffic engineering: A traffic engineering system that 424 exists outside of the network. 426 Online traffic engineering: A traffic engineering system that exists 427 within the network, typically implemented on or as adjuncts to 428 operational network elements. 430 Performance measures: Metrics that provide quantitative or 431 qualitative measures of the performance of systems or subsystems 432 of interest. 
434 Performance metric: A performance parameter defined in terms of 435 standard units of measurement. 437 Provisioning: The process of assigning or configuring network 438 resources to meet certain requests. 440 QoS routing: A class of routing systems that selects paths to be used 441 by a flow based on the QoS requirements of the flow. 443 Service Level Agreement (SLA): A contract between a provider and a 444 customer that guarantees specific levels of performance and 445 reliability at a certain cost. 447 Service Level Objective (SLO): A key element of an SLA between a 448 provider and a customer. SLOs are agreed upon as a means of 449 measuring the performance of the Service Provider and are outlined 450 as a way of avoiding disputes between the two parties based on 451 misunderstanding. 453 Stability: An operational state in which a network does not 454 oscillate in a disruptive manner from one mode to another mode. 456 Supply-side congestion management: A congestion management scheme 457 that provisions additional network resources to address existing 458 and/or anticipated congestion problems. 460 Traffic characteristic: A description of the temporal behavior or a 461 description of the attributes of a given traffic flow or traffic 462 aggregate. 464 Traffic engineering system: A collection of objects, mechanisms, and 465 protocols that are used conjunctively to accomplish traffic 466 engineering objectives. 468 Traffic flow: A stream of packets between two end-points that can be 469 characterized in a certain way. A micro-flow has a more specific 470 definition: a micro-flow is a stream of packets with the same 471 source and destination addresses, source and destination ports, 472 and protocol ID. 474 Traffic matrix: A representation of the traffic demand between a set 475 of origin and destination abstract nodes. An abstract node can 476 consist of one or more network elements. 478 Traffic monitoring: The process of observing traffic characteristics 479 at a given point in a network and collecting the traffic 480 information for analysis and further action. 482 Traffic trunk: An aggregation of traffic flows belonging to the same 483 class which are forwarded through a common path. A traffic trunk 484 may be characterized by an ingress and egress node, and a set of 485 attributes which determine its behavioral characteristics and 486 requirements from the network. 488 2. Background 490 The Internet must convey IP packets from ingress nodes to egress 491 nodes efficiently, expeditiously, and economically. Furthermore, in 492 a multiclass service environment (e.g., Diffserv capable networks - 493 see Section 4.1.4), the resource sharing parameters of the network 494 must be appropriately determined and configured according to 495 prevailing policies and service models to resolve resource contention 496 issues arising from mutual interference between packets traversing 497 the network. Thus, consideration must be given to resolving 498 competition for network resources between traffic streams belonging 499 to the same service class (intra-class contention resolution) and 500 traffic streams belonging to different classes (inter-class 501 contention resolution). 503 2.1. Context of Internet Traffic Engineering 505 The context of Internet traffic engineering includes: 507 1. A network domain context that defines the scope under 508 consideration, and in particular the situations in which the 509 traffic engineering problems occur.
The network domain context 510 includes network structure, network policies, network 511 characteristics, network constraints, network quality attributes, 512 and network optimization criteria. 514 2. A problem context defining the general and concrete issues that 515 traffic engineering addresses. The problem context includes 516 identification, abstraction of relevant features, representation, 517 formulation, specification of the requirements on the solution 518 space, and specification of the desirable features of acceptable 519 solutions. 521 3. A solution context suggesting how to address the issues 522 identified by the problem context. The solution context includes 523 analysis, evaluation of alternatives, prescription, and 524 resolution. 526 4. An implementation and operational context in which the solutions 527 are instantiated. The implementation and operational context 528 includes planning, organization, and execution. 530 The context of Internet traffic engineering and the different problem 531 scenarios are discussed in the following subsections. 533 2.2. Network Domain Context 535 IP networks range in size from small clusters of routers situated 536 within a given location, to thousands of interconnected routers, 537 switches, and other components distributed all over the world. 539 At the most basic level of abstraction, an IP network can be 540 represented as a distributed dynamic system consisting of: 542 o a set of interconnected resources which provide transport services 543 for IP traffic subject to certain constraints 545 o a demand system representing the offered load to be transported 546 through the network 548 o a response system consisting of network processes, protocols, and 549 related mechanisms which facilitate the movement of traffic 550 through the network (see also [AWD2]). 552 The network elements and resources may have specific characteristics 553 restricting the manner in which the traffic demand is handled. 554 Additionally, network resources may be equipped with traffic control 555 mechanisms managing the way in which the demand is serviced. Traffic 556 control mechanisms may be used to: 558 o control packet processing activities within a given resource 560 o arbitrate contention for access to the resource by different 561 packets 563 o regulate traffic behavior through the resource. 565 A configuration management and provisioning system may allow the 566 settings of the traffic control mechanisms to be manipulated by 567 external or internal entities in order to exercise control over the 568 way in which the network elements respond to internal and external 569 stimuli. 571 The details of how the network carries packets are specified in the 572 policies of the network administrators and are installed through 573 network configuration management and policy based provisioning 574 systems. Generally, the types of service provided by the network 575 also depend upon the technology and characteristics of the network 576 elements and protocols, the prevailing service and utility models, 577 and the ability of the network administrators to translate policies 578 into network configurations. 580 Internet networks have three significant characteristics: 582 o they provide real-time services 584 o they are mission critical 586 o their operating environments are very dynamic. 
588 The dynamic characteristics of IP and IP/MPLS networks can be 589 attributed in part to fluctuations in demand, to the interaction 590 between various network protocols and processes, to the rapid 591 evolution of the infrastructure which demands the constant inclusion 592 of new technologies and new network elements, and to transient and 593 persistent faults which occur within the system. 595 Packets contend for the use of network resources as they are conveyed 596 through the network. A network resource is considered to be 597 congested if, for an interval of time, the arrival rate of packets 598 exceeds the output capacity of the resource. Congestion may result in 599 some of the arriving packets being delayed or even dropped. 601 Congestion increases transit delay and delay variation, may lead to 602 packet loss, and reduces the predictability of network services. 603 Clearly, congestion is highly undesirable. Combating congestion at a 604 reasonable cost is a major objective of Internet traffic engineering. 606 Efficient sharing of network resources by multiple traffic streams is 607 a basic operational premise for the Internet. A fundamental 608 challenge in network operation is to increase resource utilization 609 while minimizing the possibility of congestion. 611 The Internet has to function in the presence of different classes of 612 traffic with different service requirements. RFC 2475 provides an 613 architecture for Differentiated Services (DiffServ) and makes this 614 requirement clear [RFC2475]. The RFC allows packets to be grouped 615 into behavior aggregates such that each aggregate has a common set of 616 behavioral characteristics or a common set of delivery requirements. 617 Delivery requirements of a specific set of packets may be specified 618 explicitly or implicitly. Two of the most important traffic delivery 619 requirements are capacity constraints and QoS constraints. 621 Capacity constraints can be expressed statistically as peak rates, 622 mean rates, burst sizes, or as some deterministic notion of effective 623 bandwidth. QoS requirements can be expressed in terms of: 625 o integrity constraints such as packet loss 627 o temporal constraints such as timing restrictions for the delivery 628 of each packet (delay) and timing restrictions for the delivery of 629 consecutive packets belonging to the same traffic stream (delay 630 variation). 632 2.3. Problem Context 634 There are several large problems associated with operating a network 635 of the type described in the previous section. This section analyzes the problem 636 context in relation to traffic engineering. The identification, 637 abstraction, representation, and measurement of network features 638 relevant to traffic engineering are significant issues. 640 A particular challenge is to formulate the problems that traffic 641 engineering attempts to solve. For example: 643 o how to identify the requirements on the solution space 645 o how to specify the desirable features of solutions 647 o how to actually solve the problems 649 o how to measure and characterize the effectiveness of solutions. 651 Another class of problems is how to measure and estimate relevant 652 network state parameters. Effective traffic engineering relies on a 653 good estimate of the offered traffic load as well as a view of the 654 underlying topology and associated resource constraints. A network- 655 wide view of the topology is also a must for offline planning.
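As a small, purely illustrative sketch of how such estimates might be used against the capacity constraints described in Section 2.2, the following Python fragment checks an estimated offered load against a link's capacity using a crude "effective bandwidth" heuristic. The effective_bandwidth() function, its interpolation between mean and peak rates, and the numeric values are all invented for illustration; practical effective-bandwidth models are considerably more sophisticated.

   def effective_bandwidth(mean_rate, peak_rate, weight=0.2):
       # Crude illustrative estimate: interpolate between the mean and
       # peak rates.  Real effective-bandwidth models are more subtle.
       return mean_rate + weight * (peak_rate - mean_rate)

   # Estimated offered load on one link: (mean, peak) in Mbit/s per
   # traffic aggregate, plus the link capacity.  Values are invented.
   offered_load = [(100, 400), (250, 300), (50, 600)]
   link_capacity = 1000  # Mbit/s

   required = sum(effective_bandwidth(m, p) for m, p in offered_load)
   if required > link_capacity:
       print("capacity constraint at risk: %.0f > %d Mbit/s"
             % (required, link_capacity))
   else:
       print("estimated load %.0f Mbit/s fits within %d Mbit/s"
             % (required, link_capacity))

A similar check, repeated per resource across the network, is one simple way to flag links whose capacity constraints are at risk of being violated by the estimated traffic workload.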
657 Still another class of problem is how to characterize the state of 658 the network and how to evaluate its performance. The performance 659 evaluation problem is two-fold: one aspect relates to the evaluation 660 of the system-level performance of the network; the other aspect 661 relates to the evaluation of resource-level performance, which 662 restricts attention to the performance analysis of individual network 663 resources. 665 In this document, we refer to the system-level characteristics of the 666 network as the "macro-states" and the resource-level characteristics 667 as the "micro-states." The system-level characteristics are also 668 known as the emergent properties of the network. Correspondingly, we 669 refer to the traffic engineering schemes dealing with network 670 performance optimization at the systems level as "macro-TE" and the 671 schemes that optimize at the individual resource level as "micro-TE." 672 Under certain circumstances, the system-level performance can be 673 derived from the resource-level performance using appropriate rules 674 of composition, depending upon the particular performance measures of 675 interest. 677 Another fundamental class of problem concerns how to effectively 678 optimize network performance. Performance optimization may entail 679 translating solutions for specific traffic engineering problems into 680 network configurations. Optimization may also entail some degree of 681 resource management control, routing control, and capacity 682 augmentation. 684 2.3.1. Congestion and its Ramifications 686 Congestion is one of the most significant problems in an operational 687 IP context. A network element is said to be congested if it 688 experiences sustained overload over an interval of time. Congestion 689 almost always results in degradation of service quality to end users. 690 Congestion control schemes can include demand-side policies and 691 supply-side policies. Demand-side policies may restrict access to 692 congested resources or dynamically regulate the demand to alleviate 693 the overload situation. Supply-side policies may expand or augment 694 network capacity to better accommodate offered traffic. Supply-side 695 policies may also re-allocate network resources by redistributing 696 traffic over the infrastructure. Traffic redistribution and resource 697 re-allocation serve to increase the 'effective capacity' of the 698 network. 700 The emphasis of this document is primarily on congestion management 701 schemes falling within the scope of the network, rather than on 702 congestion management systems dependent upon sensitivity and 703 adaptivity from end-systems. That is, the aspects that are 704 considered in this document with respect to congestion management are 705 those solutions that can be provided by control entities operating on 706 the network and by the actions of network administrators and network 707 operations systems. 709 2.4. Solution Context 711 The solution context for Internet traffic engineering involves 712 analysis, evaluation of alternatives, and choice between alternative 713 courses of action. Generally the solution context is based on making 714 reasonable inferences about the current or future state of the 715 network, and making decisions that may involve a preference between 716 alternative sets of action. 
More specifically, the solution context 717 demands reasonable estimates of traffic workload, characterization of 718 network state, derivation of solutions which may be implicitly or 719 explicitly formulated, and possibly instantiating a set of control 720 actions. Control actions may involve the manipulation of parameters 721 associated with routing, control over tactical capacity acquisition, 722 and control over the traffic management functions. 724 The following list of instruments may be applicable to the solution 725 context of Internet traffic engineering. 727 o A set of policies, objectives, and requirements (which may be 728 context dependent) for network performance evaluation and 729 performance optimization. 731 o A collection of online and possibly offline tools and mechanisms 732 for measurement, characterization, modeling, and control of traffic, 733 and control over the placement and allocation of network 734 resources, as well as control over the mapping or distribution of 735 traffic onto the infrastructure. 737 o A set of constraints on the operating environment, the network 738 protocols, and the traffic engineering system itself. 740 o A set of quantitative and qualitative techniques and methodologies 741 for abstracting, formulating, and solving traffic engineering 742 problems. 744 o A set of administrative control parameters which may be 745 manipulated through a Configuration Management (CM) system. The 746 CM system itself may include a configuration control subsystem, a 747 configuration repository, a configuration accounting subsystem, 748 and a configuration auditing subsystem. 750 o A set of guidelines for network performance evaluation, 751 performance optimization, and performance improvement. 753 Determining traffic characteristics through measurement or estimation 754 is very useful within the realm of the traffic engineering solution 755 space. Traffic estimates can be derived from customer subscription 756 information, traffic projections, traffic models, and from actual 757 measurements. The measurements may be performed at different levels, 758 e.g., at the traffic-aggregate level or at the flow level. 759 Measurements at the flow level or on small traffic aggregates may be 760 performed at edge nodes, when traffic enters and leaves the network. 761 Measurements for large traffic-aggregates may be performed within the 762 core of the network. 764 To conduct performance studies and to support planning of existing 765 and future networks, a routing analysis may be performed to determine 766 the paths the routing protocols will choose for various traffic 767 demands, and to ascertain the utilization of network resources as 768 traffic is routed through the network. Routing analysis captures the 769 selection of paths through the network, the assignment of traffic 770 across multiple feasible routes, and the multiplexing of IP traffic 771 over traffic trunks (if such constructs exist) and over the 772 underlying network infrastructure. A model of network topology is 773 necessary to perform routing analysis. A network topology model may 774 be extracted from: 776 o network architecture documents 778 o network designs 780 o information contained in router configuration files 782 o routing databases 784 o routing tables 786 o automated tools that discover and collate network topology 787 information. 789 Topology information may also be derived from servers that monitor 790 network state, and from servers that perform provisioning functions.
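The following Python sketch illustrates, in a highly simplified form, the kind of routing analysis described above: it computes the paths that a link-state IGP would select (shortest paths over the configured metrics) and the resulting link utilizations for a set of traffic demands. The topology, metrics, capacities, and demand figures are invented purely for illustration; a real tool would obtain them from the sources listed above.

   import heapq
   from collections import defaultdict

   # Invented topology: (node, node) -> (IGP metric, capacity in Mbit/s).
   links = {("A", "B"): (10, 1000), ("B", "C"): (10, 1000),
            ("A", "C"): (30, 1000), ("C", "D"): (10, 1000)}
   adj = defaultdict(list)
   for (u, v), (metric, _cap) in links.items():
       adj[u].append((v, metric))
       adj[v].append((u, metric))

   def shortest_path(src, dst):
       # Plain Dijkstra over the IGP metrics (illustration only).
       dist, prev, heap = {src: 0}, {}, [(0, src)]
       while heap:
           d, u = heapq.heappop(heap)
           if u == dst:
               break
           if d > dist.get(u, float("inf")):
               continue
           for v, w in adj[u]:
               if d + w < dist.get(v, float("inf")):
                   dist[v], prev[v] = d + w, u
                   heapq.heappush(heap, (d + w, v))
       path = [dst]
       while path[-1] != src:
           path.append(prev[path[-1]])
       return list(reversed(path))

   # Invented demand matrix (Mbit/s); accumulate the load on each link.
   demands = {("A", "C"): 400, ("A", "D"): 300, ("B", "D"): 200}
   load = defaultdict(float)
   for (src, dst), mbps in demands.items():
       path = shortest_path(src, dst)
       for u, v in zip(path, path[1:]):
           load[tuple(sorted((u, v)))] += mbps

   for (u, v), (_metric, cap) in links.items():
       util = 100.0 * load[tuple(sorted((u, v)))] / cap
       print("link %s-%s: %.0f%% utilized" % (u, v, util))

Comparing the computed utilizations against the link capacities shows where the routing protocols' path selection would concentrate traffic, which is the kind of information used to drive the control actions discussed in the remainder of this section.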
792 Routing in operational IP networks can be administratively controlled 793 at various levels of abstraction including the manipulation of BGP 794 attributes and IGP metrics. For path oriented technologies such as 795 MPLS, routing can be further controlled by the manipulation of 796 relevant traffic engineering parameters, resource parameters, and 797 administrative policy constraints. Within the context of MPLS, the 798 path of an explicitly routed label switched path (LSP) can be 799 computed and established in various ways including: 801 o manually 803 o automatically, online using constraint-based routing processes 804 implemented on label switching routers 806 o automatically, offline using constraint-based routing entities 807 implemented on external traffic engineering support systems. 809 2.4.1. Combating the Congestion Problem 811 Minimizing congestion is a significant aspect of Internet traffic 812 engineering. This subsection gives an overview of the general 813 approaches that have been used or proposed to combat congestion. 815 Congestion management policies can be categorized based upon the 816 following criteria (see [YARE95] for a more detailed taxonomy of 817 congestion control schemes): 819 1. Congestion Management based on Response Time Scales 821 * Long (weeks to months): Expanding network capacity by adding 822 new equipment, routers, and links takes time and is 823 comparatively costly. Capacity planning needs to take this 824 into consideration. Network capacity is expanded based on 825 estimates or forecasts of future traffic development and 826 traffic distribution. These upgrades are typically carried 827 out over weeks or months, or maybe even years. 829 * Medium (minutes to days): Several control policies fall within 830 the medium timescale category. Examples include: 832 a. Adjusting routing protocol parameters to route traffic 833 away from or towards certain segments of the network. 835 b. Setting up or adjusting explicitly routed LSPs in MPLS 836 networks to route traffic trunks away from possibly 837 congested resources or toward possibly more favorable 838 routes. 840 c. Re-configuring the logical topology of the network to make 841 it correlate more closely with the spatial traffic 842 distribution using, for example, an underlying path- 843 oriented technology such as MPLS LSPs or optical channel 844 trails. 846 Many of these adaptive schemes rely on measurement systems. A 847 measurement system monitors changes in traffic distribution, 848 traffic loads, and network resource utilization and then 849 provides feedback to the online or offline traffic engineering 850 mechanisms and tools so that they can trigger control actions 851 within the network. The traffic engineering mechanisms and 852 tools can be implemented in a distributed or centralized 853 fashion. A centralized scheme may have global visibility into 854 the network state and may produce better solutions. 855 However, centralized schemes are prone to single points of 856 failure and may not scale as well as distributed schemes. 857 Moreover, the information utilized by a centralized scheme may 858 be stale and might not reflect the actual state of the 859 network. It is not an objective of this document to make a 860 recommendation between distributed and centralized schemes: 861 that is a choice that network administrators must make based 862 on their specific needs.
864 * Short (picoseconds to minutes): This category includes packet 865 level processing functions and events that are recorded on the 866 order of several round trip times. It also includes router 867 mechanisms such as passive and active buffer management. All 868 of these mechanisms are used to control congestion or signal 869 congestion to end systems so that they can adaptively regulate 870 the rate at which traffic is injected into the network. One 871 of the most popular active queue management schemes, 872 especially for TCP traffic, is Random Early Detection (RED) 873 [FLJA93]. During congestion (but before the queue is filled), 874 the RED scheme chooses arriving packets to "mark" according to 875 a probabilistic algorithm which takes into account the average 876 queue size. A router that does not utilize explicit 877 congestion notification (ECN) [FLOY94] can simply drop marked 878 packets to alleviate congestion and implicitly notify the 879 receiver about the congestion. On the other hand, if the 880 router supports ECN, it can set the ECN field in the packet 881 header. Several variations of RED have been proposed to 882 support different drop precedence levels in multi-class 883 environments [RFC2597]. RED provides congestion avoidance 884 which is no worse than traditional Tail-Drop (TD) queue 885 management (drop arriving packets only when the queue is 886 full). Importantly, RED reduces the possibility of global 887 synchronization where retransmission bursts become synchronized 888 across the whole network, and improves fairness among 889 different TCP sessions. However, RED by itself cannot prevent 890 congestion and unfairness caused by sources unresponsive to 891 RED, e.g., UDP traffic and some misbehaved greedy connections. 892 Other schemes have been proposed to improve the performance 893 and fairness in the presence of unresponsive traffic. Some of 894 those schemes (such as Longest Queue Drop (LQD) and Dynamic 895 Soft Partitioning with Random Drop (RND) [SLDC98]) were 896 proposed as theoretical frameworks and are typically not 897 available in existing commercial products. 899 2. Congestion Management: Reactive Versus Preventive Schemes 901 * Reactive: Reactive (recovery) congestion management policies 902 react to existing congestion problems. All the policies 903 described above for the long and medium time scales can be 904 categorized as being reactive. They are based on monitoring 905 and identifying congestion problems that exist in the network, 906 and on the initiation of relevant actions to ease the situation. 908 * Preventive: Preventive (predictive/avoidance) policies take 909 proactive action to prevent congestion based on estimates and 910 predictions of future congestion problems. Some of the 911 policies described for the long and medium time scales fall 912 into this category. Preventive policies do not necessarily 913 respond immediately to existing congestion problems. Instead, 914 forecasts of traffic demand and workload distribution are 915 considered, and action may be taken to prevent potential 916 future congestion problems. The schemes described for the 917 short time scale can also be used for congestion avoidance 918 because dropping or marking packets before queues actually 919 overflow would trigger corresponding TCP sources to slow down. 921 3.
Congestion Management: Supply-Side Versus Demand-Side Schemes 923 * Supply-side: Supply-side congestion management policies 924 increase the effective capacity available to traffic in order 925 to control or reduce congestion. This can be accomplished by 926 increasing capacity or by balancing the distribution of traffic 927 over the network. Capacity planning aims to provide a 928 physical topology and associated link bandwidths that match or 929 exceed estimated traffic workload and traffic distribution 930 subject to traffic forecasts and budgetary or other 931 constraints. If the actual traffic distribution does not fit 932 the topology derived from capacity planning, then the traffic 933 can be mapped onto the topology by using routing control 934 mechanisms, by applying path oriented technologies (e.g., MPLS 935 LSPs and optical channel trails) to modify the logical 936 topology, or by employing some other load redistribution 937 mechanisms. 939 * Demand-side: Demand-side congestion management policies 940 control or regulate the offered traffic to alleviate 941 congestion problems. For example, some of the short time 942 scale mechanisms described earlier as well as policing and 943 rate-shaping mechanisms attempt to regulate the offered load 944 in various ways. 946 2.5. Implementation and Operational Context 948 The operational context of Internet traffic engineering is 949 characterized by constant changes that occur at multiple levels of 950 abstraction. The implementation context demands effective planning, 951 organization, and execution. The planning aspects may involve 952 determining prior sets of actions to achieve desired objectives. 953 Organizing involves arranging and assigning responsibility to the 954 various components of the traffic engineering system and coordinating 955 the activities to accomplish the desired TE objectives. Execution 956 involves measuring and applying corrective or perfective actions to 957 attain and maintain desired TE goals. 959 3. Traffic Engineering Process Models 961 This section describes a generic process model that captures the 962 high-level practical aspects of Internet traffic engineering in an 963 operational context. The process model is described as a sequence of 964 actions that must be carried out to optimize the performance of an 965 operational network (see also [RFC2702], [AWD2]). This process model 966 may be enacted explicitly or implicitly, by a software process or by 967 a human. 969 The traffic engineering process model is iterative [AWD2]. The four 970 phases of the process model described below are repeated as a 971 continual sequence. 973 o Define the relevant control policies that govern the operation of 974 the network. 976 o Acquire measurement data from the operational network. 978 o Analyze the network state and characterize the traffic workload. 979 Proactive analysis identifies potential problems that could 980 manifest in the future. Reactive analysis identifies existing 981 problems and determines their causes. 983 o Optimize the performance of the network. This involves a 984 decision process which selects and implements a set of actions 985 from a set of alternatives given the results of the three previous 986 steps. Optimization actions may include the use of techniques to 987 control the offered traffic and to control the distribution of 988 traffic across the network. 990 3.1.
Components of the Traffic Engineering Process Model 992 The key components of the traffic engineering process model are as 993 follows. 995 1. Measurement is crucial to the traffic engineering function. The 996 operational state of a network can only be conclusively 997 determined through measurement. Measurement is also critical to 998 the optimization function because it provides feedback data which 999 is used by traffic engineering control subsystems. This data is 1000 used to adaptively optimize network performance in response to 1001 events and stimuli originating within and outside the network. 1002 Measurement in support of the TE function can occur at different 1003 levels of abstraction. For example, measurement can be used to 1004 derive packet level characteristics, flow level characteristics, 1005 user or customer level characteristics, traffic aggregate 1006 characteristics, component level characteristics, and network 1007 wide characteristics. 1009 2. Modeling, analysis, and simulation are important aspects of 1010 Internet traffic engineering. Modeling involves constructing an 1011 abstract or physical representation which depicts relevant 1012 traffic characteristics and network attributes. A network model 1013 is an abstract representation of the network which captures 1014 relevant network features, attributes, and characteristics. 1015 Network simulation tools are extremely useful for traffic 1016 engineering. Because of the complexity of realistic quantitative 1017 analysis of network behavior, certain aspects of network 1018 performance studies can only be conducted effectively using 1019 simulation. 1021 3. Network performance optimization involves resolving network 1022 issues by transforming such issues into concepts that enable a 1023 solution, identification of a solution, and implementation of the 1024 solution. Network performance optimization can be corrective or 1025 perfective. In corrective optimization, the goal is to remedy a 1026 problem that has occurred or that is incipient. In perfective 1027 optimization, the goal is to improve network performance even 1028 when explicit problems do not exist and are not anticipated. 1030 4. Review of TE Techniques 1032 This section briefly reviews different traffic engineering approaches 1033 proposed and implemented in telecommunications and computer networks 1034 using IETF protocols and architectures. The discussion is not 1035 intended to be comprehensive. It is primarily intended to illuminate 1036 existing approaches to traffic engineering in the Internet. A 1037 historic overview of traffic engineering in telecommunications 1038 networks is provided in Appendix A, while Appendix B describes 1039 approaches in other standards bodies. 1041 4.1. Overview of IETF Projects Related to Traffic Engineering 1043 This subsection reviews a number of IETF activities pertinent to 1044 Internet traffic engineering. 1046 4.1.1. Constraint-Based Routing 1048 Constraint-based routing refers to a class of routing systems that 1049 compute routes through a network subject to the satisfaction of a set 1050 of constraints and requirements. In the most general case, 1051 constraint-based routing may also seek to optimize overall network 1052 performance while minimizing costs. 1054 The constraints and requirements may be imposed by the network itself 1055 or by administrative policies. Constraints may include bandwidth, 1056 hop count, delay, and policy instruments such as resource class 1057 attributes.
Constraints may also include domain specific attributes 1058 of certain network technologies and contexts which impose 1059 restrictions on the solution space of the routing function. Path 1060 oriented technologies such as MPLS have made constraint-based routing 1061 feasible and attractive in public IP networks. 1063 The concept of constraint-based routing within the context of MPLS 1064 traffic engineering requirements in IP networks was first described 1065 in [RFC2702] and led to developments such as MPLS-TE [RFC3209] as 1066 described in Section 4.1.6. 1068 Unlike QoS routing (for example, see [RFC2386] and [MA]) which 1069 generally addresses the issue of routing individual traffic flows to 1070 satisfy prescribed flow-based QoS requirements subject to network 1071 resource availability, constraint-based routing is applicable to 1072 traffic aggregates as well as flows and may be subject to a wide 1073 variety of constraints which may include policy restrictions. 1075 4.1.2. Integrated Services 1077 The IETF developed the Integrated Services (Intserv) model that 1078 requires resources, such as bandwidth and buffers, to be reserved a 1079 priori for a given traffic flow to ensure that the quality of service 1080 requested by the traffic flow is satisfied. The Integrated Services 1081 model includes additional components beyond those used in the best- 1082 effort model such as packet classifiers, packet schedulers, and 1083 admission control. A packet classifier is used to identify flows 1084 that are to receive a certain level of service. A packet scheduler 1085 handles the scheduling of service to different packet flows to ensure 1086 that QoS commitments are met. Admission control is used to determine 1087 whether a router has the necessary resources to accept a new flow. 1089 The main issue with the Integrated Services model has been 1090 scalability [RFC2998], especially in large public IP networks which 1091 may potentially have millions of active micro-flows in transit 1092 concurrently. 1094 A notable feature of the Integrated Services model is that it 1095 requires explicit signaling of QoS requirements from end systems to 1096 routers [RFC2753]. The Resource Reservation Protocol (RSVP) performs 1097 this signaling function and is a critical component of the Integrated 1098 Services model. RSVP is described in Section 4.1.3. 1100 4.1.3. RSVP 1102 RSVP is a soft state signaling protocol [RFC2205]. It supports 1103 receiver initiated establishment of resource reservations for both 1104 multicast and unicast flows. RSVP was originally developed as a 1105 signaling protocol within the Integrated Services framework (see 1106 Section 4.1.2) for applications to communicate QoS requirements to 1107 the network and for the network to reserve relevant resources to 1108 satisfy the QoS requirements [RFC2205]. 1110 In RSVP, the traffic sender or source node sends a PATH message to 1111 the traffic receiver with the same source and destination addresses 1112 as the traffic which the sender will generate. The PATH message 1113 contains: (1) a sender traffic specification describing the 1114 characteristics of the traffic, (2) a sender template specifying the 1115 format of the traffic, and (3) an optional advertisement 1116 specification which is used to support the concept of One Pass With 1117 Advertising (OPWA) [RFC2205]. Every intermediate router along the 1118 path forwards the PATH message to the next hop determined by the 1119 routing protocol. 
Upon receiving a PATH message, the receiver 1120 responds with a RESV message which includes a flow descriptor used to 1121 request resource reservations. The RESV message travels to the 1122 sender or source node in the opposite direction along the path that 1123 the PATH message traversed. Every intermediate router along the path 1124 can reject or accept the reservation request of the RESV message. If 1125 the request is rejected, the rejecting router will send an error 1126 message to the receiver and the signaling process will terminate. If 1127 the request is accepted, link bandwidth and buffer space are 1128 allocated for the flow and the related flow state information is 1129 installed in the router. 1131 One of the issues with the original RSVP specification was 1132 scalability. This is because reservations were required for micro- 1133 flows, so that the amount of state maintained by network elements 1134 tends to increase linearly with the number of micro-flows. These 1135 issues are described in [RFC2961] which also modifies and extends 1136 RSVP to mitigate the scaling problems to make RSVP a versatile 1137 signaling protocol for the Internet. For example, RSVP has been 1138 extended to reserve resources for aggregation of flows, to set up 1139 MPLS explicit label switched paths (see Section 4.1.6), and to 1140 perform other signaling functions within the Internet. [RFC2961] 1141 also describes a mechanism to reduce the number of Refresh messages 1142 required to maintain established RSVP sessions. 1144 4.1.4. Differentiated Services 1146 The goal of Differentiated Services (Diffserv) within the IETF was to 1147 devise scalable mechanisms for categorization of traffic into 1148 behavior aggregates, which ultimately allows each behavior aggregate 1149 to be treated differently, especially when there is a shortage of 1150 resources such as link bandwidth and buffer space [RFC2475]. One of 1151 the primary motivations for Diffserv was to devise alternative 1152 mechanisms for service differentiation in the Internet that mitigate 1153 the scalability issues encountered with the Intserv model. 1155 Diffserv uses the Differentiated Services field in the IP header (the 1156 DS field) consisting of six bits in what was formerly known as the 1157 Type of Service (TOS) octet. The DS field is used to indicate the 1158 forwarding treatment that a packet should receive at a transit node 1159 [RFC2474]. Diffserv includes the concept of Per-Hop Behavior (PHB) 1160 groups. Using the PHBs, several classes of services can be defined 1161 using different classification, policing, shaping, and scheduling 1162 rules. 1164 For an end-user of network services to utilize Differentiated 1165 Services provided by its Internet Service Provider (ISP), it may be 1166 necessary for the user to have an SLA with the ISP. An SLA may 1167 explicitly or implicitly specify a Traffic Conditioning Agreement 1168 (TCA) which defines classifier rules as well as metering, marking, 1169 discarding, and shaping rules. 1171 Packets are classified, and possibly policed and shaped, at the 1172 ingress to a Diffserv network. When a packet traverses the boundary 1173 between different Diffserv domains, the DS field of the packet may be 1174 re-marked according to existing agreements between the domains. 1176 Differentiated Services allows only a finite number of service 1177 classes to be specified by the DS field. The main advantage of the 1178 Diffserv approach relative to the Intserv model is scalability.
1179 Resources are allocated on a per-class basis and the amount of state 1180 information is proportional to the number of classes rather than to 1181 the number of application flows. 1183 The Diffserv model deals with traffic management issues on a per hop 1184 basis. The Diffserv control model consists of a collection of micro- 1185 TE control mechanisms. Other traffic engineering capabilities, such 1186 as capacity management (including routing control), are also required 1187 in order to deliver acceptable service quality in Diffserv networks. 1188 The concept of Per Domain Behaviors has been introduced to better 1189 capture the notion of Differentiated Services across a complete 1190 domain [RFC3086]. 1192 4.1.5. QUIC 1194 TBD 1196 4.1.6. Multiprotocol Label Switching (MPLS) 1198 MPLS is an advanced forwarding scheme which also includes extensions 1199 to conventional IP control plane protocols. MPLS extends the 1200 Internet routing model and enhances packet forwarding and path 1201 control [RFC3031]. 1203 At the ingress to an MPLS domain, Label Switching Routers (LSRs) 1204 classify IP packets into Forwarding Equivalence Classes (FECs) based 1205 on a variety of factors, including, e.g., a combination of the 1206 information carried in the IP header of the packets and the local 1207 routing information maintained by the LSRs. An MPLS label stack 1208 entry is then prepended to each packet according to its forwarding 1209 equivalence class. The MPLS label stack entry is 32 bits long and 1210 contains a 20-bit label field. 1212 An LSR makes forwarding decisions by using the label prepended to 1213 packets as the index into a local next hop label forwarding entry 1214 (NHLFE). The packet is then processed as specified in the NHLFE. 1215 The incoming label may be replaced by an outgoing label (label swap), 1216 and the packet may be forwarded to the next LSR. Before a packet 1217 leaves an MPLS domain, its MPLS label may be removed (label pop). A 1218 Label Switched Path (LSP) is the path between an ingress LSR and an 1219 egress LSR that a labeled packet traverses. The path of an 1220 explicit LSP is defined at the originating (ingress) node of the LSP. 1221 MPLS can use a signaling protocol such as RSVP or LDP to set up LSPs. 1223 MPLS is a very powerful technology for Internet traffic engineering 1224 because it supports explicit LSPs which allow constraint-based 1225 routing to be implemented efficiently in IP networks [AWD2]. The 1226 requirements for traffic engineering over MPLS are described in 1227 [RFC2702]. Extensions to RSVP to support instantiation of explicit 1228 LSPs are discussed in [RFC3209]. 1230 4.1.7. Generalized MPLS 1232 GMPLS extends MPLS control protocols to encompass time-division 1233 (e.g., SONET/SDH, PDH, G.709), wavelength (lambdas), and spatial 1234 switching (e.g., incoming port or fiber to outgoing port or fiber) as 1235 well as continuing to support packet switching. GMPLS provides a 1236 common set of control protocols for all of these layers (including 1237 some technology-specific extensions), each of which has a diverse data 1238 or forwarding plane. GMPLS covers both the signaling and the routing 1239 part of that control plane and is based on the Traffic Engineering 1240 extensions to MPLS (see Section 4.1.6).
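Before turning to the circuit-switching extensions introduced by GMPLS, the label-swapping forwarding operation described in Section 4.1.6 can be illustrated with a minimal sketch. The Python fragment below uses purely hypothetical label values, table entries, and next-hop names; it is a conceptual illustration of an NHLFE-style lookup, not a definitive or complete implementation of MPLS forwarding.

   # Minimal sketch of label-based forwarding at an LSR.  Labels,
   # next hops, and table entries are hypothetical.
   NHLFE = {
       1001: {"op": "swap", "out_label": 2002, "next_hop": "LSR-B"},
       1002: {"op": "pop",  "out_label": None, "next_hop": "Egress-C"},
   }

   def forward(packet):
       """Process one labeled packet according to its NHLFE entry."""
       label = packet["label_stack"][0]      # top label indexes the NHLFE
       entry = NHLFE[label]
       if entry["op"] == "swap":
           packet["label_stack"][0] = entry["out_label"]
       elif entry["op"] == "pop":
           packet["label_stack"].pop(0)      # remove the top label
       return entry["next_hop"], packet

   pkt = {"label_stack": [1001], "payload": "IP packet"}
   print(forward(pkt))   # forwarded to 'LSR-B' with label 2002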
1242 In GMPLS, the original MPLS architecture is extended to include LSRs 1243 whose forwarding planes rely on circuit switching, and therefore 1244 cannot forward data based on the information carried in either packet 1245 or cell headers. Specifically, such LSRs include devices where the 1246 switching is based on time slots, wavelengths, or physical ports. 1247 These additions impact basic LSP properties: how labels are requested 1248 and communicated, the unidirectional nature of MPLS LSPs, how errors 1249 are propagated, and information provided for synchronizing the 1250 ingress and egress LSRs. 1252 4.1.8. IP Performance Metrics 1254 The IETF IP Performance Metrics (IPPM) working group has developed a 1255 set of standard metrics that can be used to monitor the quality, 1256 performance, and reliability of Internet services. These metrics can 1257 be applied by network operators, end-users, and independent testing 1258 groups to provide users and service providers with a common 1259 understanding of the performance and reliability of the Internet 1260 component 'clouds' they use/provide [RFC2330]. The criteria for 1261 performance metrics developed by the IPPM working group are described 1262 in [RFC2330]. Examples of performance metrics include one-way packet 1263 loss [RFC7680], one-way delay [RFC7679], and connectivity measures 1264 between two nodes [RFC2678]. Other metrics include second-order 1265 measures of packet loss and delay. 1267 Some of the performance metrics specified by the IPPM working group 1268 are useful for specifying SLAs. SLAs are sets of service level 1269 objectives negotiated between users and service providers, wherein 1270 each objective is a combination of one or more performance metrics, 1271 possibly subject to certain constraints. 1273 4.1.9. Flow Measurement 1275 The IETF Real Time Flow Measurement (RTFM) working group produced an 1276 architecture that defines a method to specify traffic flows as well 1277 as a number of components for flow measurement (meters, meter 1278 readers, managers) [RFC2722]. A flow measurement system enables 1279 network traffic flows to be measured and analyzed at the flow level 1280 for a variety of purposes. As noted in RFC 2722, a flow measurement 1281 system can be very useful in the following contexts: 1283 o understanding the behavior of existing networks 1285 o planning for network development and expansion 1286 o quantification of network performance 1288 o verifying the quality of network service 1290 o attribution of network usage to users. 1292 A flow measurement system consists of meters, meter readers, and 1293 managers. A meter observes packets passing through a measurement 1294 point, classifies them into groups, accumulates usage data (such as 1295 the number of packets and bytes for each group), and stores the usage 1296 data in a flow table. A group may represent any collection of user 1297 applications, hosts, networks, etc. A meter reader gathers usage 1298 data from various meters so it can be made available for analysis. A 1299 manager is responsible for configuring and controlling meters and 1300 meter readers. The instructions received by a meter from a manager 1301 include flow specifications, meter control parameters, and sampling 1302 techniques. The instructions received by a meter reader from a 1303 manager include the address of the meter whose data is to be 1304 collected, the frequency of data collection, and the types of flows 1305 to be collected. 1307 4.1.10.
Endpoint Congestion Management 1309 [RFC3124] provides a set of congestion control mechanisms for use by 1310 transport protocols. It also allows the development of 1311 mechanisms for unifying congestion control across a subset of an 1312 endpoint's active unicast connections (called a congestion group). A 1313 congestion manager continuously monitors the state of the path for 1314 each congestion group under its control. The manager uses that 1315 information to instruct a scheduler on how to partition bandwidth 1316 among the connections of that congestion group. 1318 4.1.11. TE Extensions to the IGPs 1320 TBD 1322 4.1.12. Link-State BGP 1324 In a number of environments, a component external to a network is 1325 called upon to perform computations based on the network topology and 1326 current state of the connections within the network, including 1327 traffic engineering information. This information is typically 1328 distributed by IGP routing protocols within the network (see 1329 Section 4.1.11). 1331 The Border Gateway Protocol (BGP) (see Section 7) is one of the essential 1332 routing protocols that glue the Internet together. BGP Link State 1333 (BGP-LS) [RFC7752] is a mechanism by which link-state and traffic 1334 engineering information can be collected from networks and shared 1335 with external components using the BGP routing protocol. The 1336 mechanism is applicable to physical and virtual IGP links, and is 1337 subject to policy control. 1339 Information collected by BGP-LS can be used to construct the Traffic 1340 Engineering Database (TED, see Section 4.1.19) for use by the Path 1341 Computation Element (PCE, see Section 4.1.13), or may be used by 1342 Application-Layer Traffic Optimization (ALTO) servers (see 1343 Section 4.1.14). 1345 4.1.13. Path Computation Element 1347 Constraint-based path computation is a fundamental building block for 1348 traffic engineering in MPLS and GMPLS networks. Path computation in 1349 large, multi-domain networks is complex and may require special 1350 computational components and cooperation between the elements in 1351 different domains. The Path Computation Element (PCE) [RFC4655] is 1352 an entity (component, application, or network node) that is capable 1353 of computing a network path or route based on a network graph and 1354 applying computational constraints. 1356 Thus, a PCE can provide a central component in a traffic engineering 1357 system operating on the Traffic Engineering Database (TED, see 1358 Section 4.1.19) with delegated responsibility for determining paths 1359 in MPLS, GMPLS, or Segment Routing networks. The PCE uses the Path 1360 Computation Element Communication Protocol (PCEP) [RFC5440] to 1361 communicate with Path Computation Clients (PCCs), such as MPLS LSRs, 1362 to answer their requests for computed paths or to instruct them to 1363 initiate new paths [RFC8281] and maintain state about paths already 1364 installed in the network [RFC8231]. 1366 PCEs form key components of a number of traffic engineering systems, 1367 such as the Application of the Path Computation Element Architecture 1368 [RFC6805], the Applicability of a Stateful Path Computation Element 1369 [RFC8051], Abstraction and Control of TE Networks (ACTN) 1370 (Section 4.1.16), Centralized Network Control [RFC8283], and Software 1371 Defined Networking (SDN) (Section 5.3.2). 1373 4.1.14. Application-Layer Traffic Optimization 1375 TBD 1377 4.1.15.
Segment Routing with MPLS encapsulation (SR-MPLS) 1379 Segment Routing (SR) leverages the source routing and tunneling 1380 paradigms. The path a packet takes is defined at the ingress and the 1381 packet is tunneled to the egress. A node steers a packet through a 1382 controlled set of instructions, called segments, by prepending the 1383 packet with an SR header: a label stack in the MPLS case. 1385 A segment can represent any instruction, topological or service- 1386 based, thanks to the MPLS architecture [RFC3031]. Labels can be 1387 looked up in a global context (platform wide) as well as in some 1388 other context (see "context labels" in Section 3 of [RFC5331]). 1390 4.1.15.1. Base Segment Routing Identifier Types 1392 Segments are identified by Segment Identifiers (SIDs). There are 1393 four types of SID that are relevant for traffic engineering. 1395 Prefix SID: Uses the SR Global Block (SRGB), must be unique within 1396 the routing domain SRGB, and is advertised by an IGP. The Prefix- 1397 SID can be configured as an absolute value or an index. 1399 Node SID: A Prefix SID with the 'N' (node) bit set. It is 1400 associated with a host prefix (/32 or /128) that identifies the 1401 node. More than one Node SID can be configured per node. 1403 Adjacency SID: Locally significant by default, an Adjacency SID can 1404 be made globally significant through use of the 'L' flag. It 1405 identifies a unidirectional adjacency. In most implementations 1406 Adjacency SIDs are automatically allocated for each adjacency. 1407 They are always encoded as an absolute (not indexed) value. 1409 Binding SID: A Binding SID has two purposes: 1411 1. Mapping Server in IS-IS 1413 The SID/Label Binding TLV is used to advertise the mappings 1414 of prefixes to SIDs/Labels. This functionality is called 1415 the Segment Routing Mapping Server (SRMS). The behavior of 1416 the SRMS is defined in [RFC8661]. 1418 2. Cross-connect (label to FEC mapping) 1420 This is fundamental for multi-domain/multi-layer operation. 1421 The Binding SID identifies a new path available at the 1422 anchor point. It is always local to the originator, must 1423 not be present at the top of the stack, and must be looked 1424 up in the context of the Node SID. It could be provisioned 1425 through Netconf/Restconf, PCEP, BGP, or the CLI. 1427 4.1.16. Network Virtualization and Abstraction 1429 One of the main drivers for Software Defined Networking (SDN) 1430 [RFC7149] is a decoupling of the network control plane from the data 1431 plane. This separation has been achieved for TE networks with the 1432 development of MPLS/GMPLS (Section 4.1.6 and Section 4.1.7) and the Path 1433 Computation Element (PCE) (Section 4.1.13). One of the advantages of 1434 SDN is its logically centralized control regime that allows a global 1435 view of the underlying networks. Centralized control in SDN helps 1436 improve network resource utilization compared with distributed 1437 network control. 1439 Abstraction and Control of TE Networks (ACTN) [RFC8453] defines a 1440 hierarchical SDN architecture which describes the functional entities 1441 and methods for the coordination of resources across multiple 1442 domains, to provide end-to-end traffic engineered services. ACTN 1443 facilitates end-to-end connections and provides them to the user. 1444 ACTN is focused on: 1446 o Abstraction of the underlying network resources and how they are 1447 provided to higher-layer applications and customers.
1449 o Virtualization of underlying resources for use by the customer, 1450 application, or service. The creation of a virtualized 1451 environment allows operators to view and control multi-domain 1452 networks as a single virtualized network. 1454 o Presentation to customers of networks as a virtual network via 1455 open and programmable interfaces. 1457 The ACTN managed infrastructure is built from traffic engineered 1458 network resources, which may include statistical packet bandwidth, 1459 physical forwarding plane resources (such as wavelengths and time 1460 slots), and forwarding and cross-connect capabilities. The type of 1461 network virtualization seen in ACTN allows customers and applications 1462 (tenants) to utilize and independently control allocated virtual 1463 network resources as if they were physically their 1464 own resources. The ACTN network is "sliced", with tenants being given 1465 a different partial and abstracted topology view of the physical 1466 underlying network. 1468 4.1.17. Network Slicing 1470 TBD 1472 4.1.18. Deterministic Networking 1474 TBD 1476 4.1.19. Network TE State Definition and Presentation 1478 The network states that are relevant to traffic engineering need 1479 to be stored in the system and presented to the user. The Traffic 1480 Engineering Database (TED) is a collection of all TE information 1481 about all TE nodes and TE links in the network, and is an essential 1482 component of TE systems such as MPLS-TE [RFC2702] and GMPLS 1483 [RFC3945]. In order to formally define the data in the TED and to 1484 present the data to the user with high usability, the data modeling 1485 language YANG [RFC7950] can be used as described in [RFC8795]. 1487 4.1.20. System Management and Control Interfaces 1489 The traffic engineering control system needs to have a management 1490 interface that is human-friendly and control interfaces that are 1491 programmable for automation. The Network Configuration Protocol 1492 (NETCONF) [RFC6241] or the RESTCONF Protocol [RFC8040] provide 1493 programmable interfaces that are also human-friendly. These 1494 protocols use XML or JSON encoded messages. When message compactness 1495 or protocol bandwidth consumption needs to be optimized for the 1496 control interface, other protocols, such as Group Communication for 1497 the Constrained Application Protocol (CoAP) [RFC7390] or gRPC, are 1498 available, especially when the protocol messages are encoded in a 1499 binary format. Along with any of these protocols, the data modeling 1500 language YANG [RFC7950] can be used to formally and precisely define 1501 the interface data. 1503 The Path Computation Element Communication Protocol (PCEP) [RFC5440] 1504 is another protocol that has evolved to be an option for the TE 1505 system control interface. The messages of PCEP are TLV-based, not 1506 defined by a data modeling language such as YANG. 1508 4.2. Content Distribution 1510 The Internet is dominated by client-server interactions, principally 1511 Web traffic, although in the future more sophisticated media servers 1512 may become dominant. The location and performance of major 1513 information servers have a significant impact on the traffic patterns 1514 within the Internet as well as on the perception of service quality 1515 by end users. 1517 A number of dynamic load balancing techniques have been devised to 1518 improve the performance of replicated information servers.
These 1519 techniques can cause spatial traffic characteristics to become more 1520 dynamic in the Internet because information servers can be 1521 dynamically picked based upon the location of the clients, the 1522 location of the servers, the relative utilization of the servers, the 1523 relative performance of different networks, and the relative 1524 performance of different parts of a network. This process of 1525 assignment of distributed servers to clients is called traffic 1526 directing. It is an application layer function. 1528 Traffic directing schemes that allocate servers in multiple 1529 geographically dispersed locations to clients may require empirical 1530 network performance statistics to make more effective decisions. In 1531 the future, network measurement systems may need to provide this type 1532 of information. 1534 When congestion exists in the network, traffic directing and traffic 1535 engineering systems should act in a coordinated manner. This topic 1536 is for further study. 1538 The issues related to location and replication of information 1539 servers, particularly web servers, are important for Internet traffic 1540 engineering because these servers contribute a substantial proportion 1541 of Internet traffic. 1543 5. Taxonomy of Traffic Engineering Systems 1545 This section presents a short taxonomy of traffic engineering systems 1546 constructed based on traffic engineering styles and views as listed 1547 below and described in greater detail in the following subsections of 1548 this document. 1550 o Time-dependent versus State-dependent versus Event-dependent 1552 o Offline versus Online 1554 o Centralized versus Distributed 1556 o Local versus Global Information 1558 o Prescriptive versus Descriptive 1560 o Open Loop versus Closed Loop 1562 o Tactical versus Strategic 1564 5.1. Time-Dependent Versus State-Dependent Versus Event Dependent 1566 Traffic engineering methodologies can be classified as time- 1567 dependent, state-dependent, or event-dependent. All TE schemes are 1568 considered to be dynamic in this document. Static TE implies that no 1569 traffic engineering methodology or algorithm is being applied - it is 1570 a feature of network planning, but lacks the reactive and flexible 1571 nature of traffic engineering. 1573 In time-dependent TE, historical information based on periodic 1574 variations in traffic (such as time of day) is used to pre-program 1575 routing and other TE control mechanisms. Additionally, customer 1576 subscription or traffic projection may be used. Pre-programmed 1577 routing plans typically change on a relatively long time scale (e.g., 1578 daily). Time-dependent algorithms do not attempt to adapt to short- 1579 term variations in traffic or changing network conditions. An 1580 example of a time-dependent algorithm is a global centralized 1581 optimizer where the input to the system is a traffic matrix and 1582 multi-class QoS requirements as described in [MR99]. Another example of 1583 such a methodology is the application of data mining to Internet 1584 traffic [AJ19], which enables the use of various machine learning 1585 algorithms to identify patterns within historically collected 1586 datasets about Internet traffic, and to extract information that can 1587 guide decision-making and improve the efficiency and productivity 1588 of operational processes.
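As a toy illustration of the time-dependent, centralized optimization described above, the following Python sketch evaluates candidate routing plans against a forecast traffic matrix and keeps the plan with the lowest worst-case link utilization. The topology, capacities, demands, and plan names are hypothetical assumptions, and the selection rule is deliberately simplistic; it shows the principle only, not any particular algorithm from the referenced work.

   # Offline, time-dependent TE sketch: pick the routing plan that
   # minimizes the worst link utilization for forecast demands.
   capacity = {("A", "B"): 10.0, ("B", "C"): 10.0, ("A", "C"): 5.0}

   # Forecast traffic matrix (same units as capacity).
   demand = {("A", "C"): 8.0}

   # Candidate routing plans: each maps a demand to the links it uses.
   plans = {
       "shortest-path": {("A", "C"): [("A", "C")]},
       "detour-via-B":  {("A", "C"): [("A", "B"), ("B", "C")]},
   }

   def max_utilization(plan):
       """Worst link utilization if the forecast demands follow 'plan'."""
       load = {link: 0.0 for link in capacity}
       for flow, links in plan.items():
           for link in links:
               load[link] += demand[flow]
       return max(load[link] / capacity[link] for link in capacity)

   best = min(plans, key=lambda name: max_utilization(plans[name]))
   print(best, max_utilization(plans[best]))   # detour-via-B 0.8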
1590 State-dependent TE adapts the routing plans based on the current 1591 state of the network which provides additional information on 1592 variations in actual traffic (i.e., perturbations from regular 1593 variations) that could not be predicted using historical information. 1594 Constraint-based routing is an example of state-dependent TE 1595 operating on a relatively long time scale. An example operating on a 1596 relatively short timescale is a load-balancing algorithm described in 1597 [MATE]. The state of the network can be based on parameters such as 1598 utilization, packet delay, and packet loss that can be advertised or 1599 flooded by the routers. Another approach is for a particular router 1600 performing adaptive TE to send probe packets along a path to gather 1601 the state of that path. [RFC6374] defines protocol extensions to 1602 collect performance measurements from MPLS networks. Another 1603 approach is for a management system to gather the relevant 1604 information directly from network elements using telemetry data 1605 collection "publication/subscription" techniques [RFC7923]. Timely 1606 gathering and distribution of state information is critical for 1607 adaptive TE. While time-dependent algorithms are suitable for 1608 predictable traffic variations, state-dependent algorithms may be 1609 applied to increase network efficiency and resilience to adapt to the 1610 prevailing network state. 1612 Event-dependent TE methods can also be used for TE path selection. 1613 Event-dependent TE methods are distinct from time-dependent and 1614 state-dependent TE methods in the manner in which paths are selected. 1615 These algorithms are adaptive and distributed in nature and typically 1616 use learning models to find good paths for TE in a network. While 1617 state-dependent TE models typically use available-link-bandwidth 1618 (ALB) flooding for TE path selection, event-dependent TE methods do 1619 not require ALB flooding. Rather, event-dependent TE methods 1620 typically search out capacity by learning models, as in the success- 1621 to-the-top (STT) method. ALB flooding can be resource intensive, 1622 since it requires link bandwidth to carry LSAs, processor capacity to 1623 process LSAs, and the overhead can limit area/Autonomous System (AS) 1624 size. Modeling results suggest that event-dependent TE methods could 1625 lead to a reduction in ALB flooding overhead without loss of network 1626 throughput performance [I-D.ietf-tewg-qos-routing]. 1628 5.2. Offline Versus Online 1630 Traffic engineering requires the computation of routing plans. The 1631 computation may be performed offline or online. The computation can 1632 be done offline for scenarios where routing plans need not be 1633 executed in real-time. For example, routing plans computed from 1634 forecast information may be computed offline. Typically, offline 1635 computation is also used to perform extensive searches on multi- 1636 dimensional solution spaces. 1638 Online computation is required when the routing plans must adapt to 1639 changing network conditions as in state-dependent algorithms. Unlike 1640 offline computation (which can be computationally demanding), online 1641 computation is geared toward relatively simple and fast calculations to 1642 select routes, fine-tune the allocations of resources, and perform 1643 load balancing. 1645 5.3.
Centralized Versus Distributed 1647 Centralized control has a central authority which determines routing 1648 plans and perhaps other TE control parameters on behalf of each 1649 router. The central authority collects the network-state information 1650 from all routers periodically and returns the routing information to 1651 the routers. The routing update cycle is a critical parameter 1652 directly impacting the performance of the network being controlled. 1653 Centralized control may need high processing power and high bandwidth 1654 control channels. 1656 Distributed control determines route selection by each router 1657 autonomously based on the router's view of the state of the network. 1658 The network state information may be obtained by the router using a 1659 probing method or distributed by other routers on a periodic basis 1660 using link state advertisements. Network state information may also 1661 be disseminated under exceptional conditions. Examples of protocol 1662 extensions used to advertise network link state information are 1663 defined in [RFC5305], [RFC6119], [RFC7471], [RFC8570], and [RFC8571]. 1665 5.3.1. Hybrid Systems 1667 TBD 1669 5.3.2. Considerations for Software Defined Networking 1671 As discussed in Section 4.1.16, one of the main drivers for SDN is a 1672 decoupling of the network control plane from the data plane 1673 [RFC7149]. Centralized control in SDN helps improve network resource 1674 utilization compared with distributed network control. 1676 TBD 1678 5.4. Local Versus Global 1680 Traffic engineering algorithms may require local or global network- 1681 state information. 1683 Local information pertains to the state of a portion of the domain. 1684 Examples include the bandwidth and packet loss rate of a particular 1685 path. Local state information may be sufficient for certain 1686 instances of TE with distributed control. 1688 Global information pertains to the state of the entire domain 1689 undergoing traffic engineering. Examples include a global traffic 1690 matrix and loading information on each link throughout the domain of 1691 interest. Global state information is typically required with 1692 centralized control. Distributed TE systems may also need global 1693 information in some cases. 1695 5.5. Prescriptive Versus Descriptive 1697 TE systems may also be classified as prescriptive or descriptive. 1699 Prescriptive traffic engineering evaluates alternatives and 1700 recommends a course of action. Prescriptive traffic engineering can 1701 be further categorized as either corrective or perfective. 1702 Corrective TE prescribes a course of action to address an existing or 1703 predicted anomaly. Perfective TE prescribes a course of action to 1704 evolve and improve network performance even when no anomalies are 1705 evident. 1707 Descriptive traffic engineering, on the other hand, characterizes the 1708 state of the network and assesses the impact of various policies 1709 without recommending any particular course of action. 1711 5.5.1. Intent-Based Networking 1713 TBD 1715 5.6. Open-Loop Versus Closed-Loop 1717 Open-loop traffic engineering control is where control action does 1718 not use feedback information from the current network state. The 1719 control action may use its own local information for accounting 1720 purposes, however. 1722 Closed-loop traffic engineering control is where control action 1723 utilizes feedback information from the network state.
The feedback 1724 information may be in the form of historical information or current 1725 measurement. 1727 5.7. Tactical versus Strategic 1729 Tactical traffic engineering aims to address specific performance 1730 problems (such as hot-spots) that occur in the network from a 1731 tactical perspective, without consideration of overall strategic 1732 imperatives. Without proper planning and insights, tactical TE tends 1733 to be ad hoc in nature. 1735 Strategic traffic engineering approaches the TE problem from a more 1736 organized and systematic perspective, taking into consideration the 1737 immediate and longer term consequences of specific policies and 1738 actions. 1740 6. Recommendations for Internet Traffic Engineering 1742 This section describes high-level recommendations for traffic 1743 engineering in the Internet. These recommendations are presented in 1744 general terms. 1746 The recommendations describe the capabilities needed to solve a 1747 traffic engineering problem or to achieve a traffic engineering 1748 objective. Broadly speaking, these recommendations can be 1749 categorized as either functional or non-functional recommendations. 1751 Functional recommendations for Internet traffic engineering describe 1752 the functions that a traffic engineering system should perform. 1753 These functions are needed to realize traffic engineering objectives 1754 by addressing traffic engineering problems. 1756 Non-functional recommendations for Internet traffic engineering 1757 relate to the quality attributes or state characteristics of a 1758 traffic engineering system. These recommendations may contain 1759 conflicting assertions and may sometimes be difficult to quantify 1760 precisely. 1762 6.1. Generic Non-functional Recommendations 1764 The generic non-functional recommendations for Internet traffic 1765 engineering include: usability, automation, scalability, stability, 1766 visibility, simplicity, efficiency, reliability, correctness, 1767 maintainability, extensibility, interoperability, and security. In a 1768 given context, some of these recommendations may be critical while 1769 others may be optional. Therefore, prioritization may be required 1770 during the development phase of a traffic engineering system (or 1771 components thereof) to tailor it to a specific operational context. 1773 In the following paragraphs, some of the aspects of the non- 1774 functional recommendations for Internet traffic engineering are 1775 summarized. 1777 Usability: Usability is a human factor aspect of traffic engineering 1778 systems. Usability refers to the ease with which a traffic 1779 engineering system can be deployed and operated. In general, it is 1780 desirable to have a TE system that can be readily deployed in an 1781 existing network. It is also desirable to have a TE system that is 1782 easy to operate and maintain. 1784 Automation: Whenever feasible, a traffic engineering system should 1785 automate as many traffic engineering functions as possible to 1786 minimize the amount of human effort needed to control and analyze 1787 operational networks. Automation is particularly imperative in large 1788 scale public networks because of the high cost of the human aspects 1789 of network operations and the high risk of network problems caused by 1790 human errors. Automation may entail the incorporation of automatic 1791 feedback and intelligence into some components of the traffic 1792 engineering system. 
1794 Scalability: Contemporary public networks are growing very fast with 1795 respect to network size and traffic volume. Therefore, a TE system 1796 should be scalable to remain applicable as the network evolves. In 1797 particular, a TE system should remain functional as the network 1798 expands with regard to the number of routers and links, and with 1799 respect to the traffic volume. A TE system should have a scalable 1800 architecture, should not adversely impair other functions and 1801 processes in a network element, and should not consume too many 1802 network resources when collecting and distributing state information 1803 or when exerting control. 1805 Stability: Stability is a very important consideration in traffic 1806 engineering systems that respond to changes in the state of the 1807 network. State-dependent traffic engineering methodologies typically 1808 mandate a tradeoff between responsiveness and stability. It is 1809 strongly recommended that, when tradeoffs are warranted between 1810 responsiveness and stability, the tradeoff should be made in 1811 favor of stability (especially in public IP backbone networks). 1813 Flexibility: A TE system should be flexible to allow for changes in 1814 optimization policy. In particular, a TE system should provide 1815 sufficient configuration options so that a network administrator can 1816 tailor the TE system to a particular environment. It may also be 1817 desirable to have both online and offline TE subsystems which can be 1818 independently enabled and disabled. TE systems that are used in 1819 multi-class networks should also have options to support class based 1820 performance evaluation and optimization. 1822 Visibility: As part of the TE system, mechanisms should exist to 1823 collect statistics from the network and to analyze these statistics 1824 to determine how well the network is functioning. Derived statistics 1825 such as traffic matrices, link utilization, latency, packet loss, and 1826 other performance measures of interest which are determined from 1827 network measurements can be used as indicators of prevailing network 1828 conditions. Other examples of status information which should be 1829 observed include existing functional routing information 1830 (additionally, in the context of MPLS, existing LSP routes), etc. 1832 Simplicity: Generally, a TE system should be as simple as possible. 1833 More importantly, the TE system should be relatively easy to use 1834 (i.e., clean, convenient, and intuitive user interfaces). Simplicity 1835 in user interface does not necessarily imply that the TE system will 1836 use naive algorithms. When complex algorithms and internal 1837 structures are used, such complexities should be hidden as much as 1838 possible from the network administrator through the user interface. 1840 Interoperability: Whenever feasible, traffic engineering systems and 1841 their components should be developed with open standards based 1842 interfaces to allow interoperation with other systems and components. 1844 Security: Security is a critical consideration in traffic engineering 1845 systems. Such traffic engineering systems typically exert control 1846 over certain functional aspects of the network to achieve the desired 1847 performance objectives. Therefore, adequate measures must be taken 1848 to safeguard the integrity of the traffic engineering system.
1849 Adequate measures must also be taken to protect the network from 1850 vulnerabilities that originate from security breaches and other 1851 impairments within the traffic engineering system. 1853 The remainder of this section will focus on some of the high-level 1854 functional recommendations for traffic engineering. 1856 6.2. Routing Recommendations 1858 Routing control is a significant aspect of Internet traffic 1859 engineering. Routing impacts many of the key performance measures 1860 associated with networks, such as throughput, delay, and utilization. 1861 Generally, it is very difficult to provide good service quality in a 1862 wide area network without effective routing control. A desirable 1863 routing system is one that takes traffic characteristics and network 1864 constraints into account during route selection while maintaining 1865 stability. 1867 Traditional shortest path first (SPF) interior gateway protocols are 1868 based on shortest path algorithms and have limited control 1869 capabilities for traffic engineering [RFC2702], [AWD2]. These 1870 limitations include: 1872 1. The well known issues with pure SPF protocols, which do not take 1873 network constraints and traffic characteristics into account 1874 during route selection. For example, since IGPs always use the 1875 shortest paths (based on administratively assigned link metrics) 1876 to forward traffic, load sharing cannot be accomplished among 1877 paths of different costs. Using shortest paths to forward 1878 traffic conserves network resources, but may cause the following 1879 problems: 1) If traffic from a source to a destination exceeds 1880 the capacity of a link along the shortest path, the link (hence 1881 the shortest path) becomes congested while a longer path between 1882 these two nodes may be under-utilized; 2) The shortest paths from 1883 different sources can overlap at some links. If the total 1884 traffic from the sources exceeds the capacity of any of these 1885 links, congestion will occur. Problems can also occur because 1886 traffic demand changes over time but network topology and routing 1887 configuration cannot be changed as rapidly. This causes the 1888 network topology and routing configuration to become sub-optimal 1889 over time, which may result in persistent congestion problems. 1891 2. The Equal-Cost Multi-Path (ECMP) capability of SPF IGPs supports 1892 sharing of traffic among equal cost paths between two nodes. 1893 However, ECMP attempts to divide the traffic as equally as 1894 possible among the equal cost shortest paths. Generally, ECMP 1895 does not support configurable load sharing ratios among equal 1896 cost paths. The result is that one of the paths may carry 1897 significantly more traffic than other paths because it may also 1898 carry traffic from other sources. This situation can result in 1899 congestion along the path that carries more traffic. 1901 3. Modifying IGP metrics to control traffic routing tends to have 1902 network-wide effects. Consequently, undesirable and unanticipated 1903 traffic shifts can be triggered as a result. Recent work 1904 described in Section 8 may be capable of better control [FT00], 1905 [FT01]. 1907 Because of these limitations, new capabilities are needed to enhance 1908 the routing function in IP networks. Some of these capabilities have 1909 been described elsewhere and are summarized below.
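The first limitation listed above can be made concrete with a small sketch. The Python fragment below, using a hypothetical three-node topology and illustrative bandwidth figures, prunes links whose available bandwidth is below the requested amount before running a shortest-path search, so that a longer but feasible path is selected instead of a congested shortest path. It is a minimal sketch of the constraint-based routing idea discussed next, not a definitive implementation.

   import heapq

   # Hypothetical links: (IGP metric, available bandwidth in Mb/s).
   links = {
       ("A", "C"): (1, 20.0),     # the shortest path, but little headroom
       ("A", "B"): (1, 200.0),
       ("B", "C"): (1, 200.0),
   }

   def neighbors(node, min_bw):
       """Links from 'node' with at least 'min_bw' still available."""
       for (u, v), (metric, avail) in links.items():
           if u == node and avail >= min_bw:
               yield v, metric

   def constrained_spf(src, dst, min_bw):
       """Shortest path restricted to links meeting the constraint."""
       queue, seen = [(0, src, [src])], set()
       while queue:
           cost, node, path = heapq.heappop(queue)
           if node == dst:
               return cost, path
           if node in seen:
               continue
           seen.add(node)
           for nxt, metric in neighbors(node, min_bw):
               heapq.heappush(queue, (cost + metric, nxt, path + [nxt]))
       return None

   print(constrained_spf("A", "C", min_bw=0))      # (1, ['A', 'C'])
   print(constrained_spf("A", "C", min_bw=100.0))  # (2, ['A', 'B', 'C'])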
1911 Constraint-based routing is desirable to evolve the routing 1912 architecture of IP networks, especially public IP backbones with 1913 complex topologies [RFC2702]. Constraint-based routing computes 1914 routes to fulfill requirements subject to constraints. Constraints 1915 may include bandwidth, hop count, delay, and administrative policy 1916 instruments such as resource class attributes [RFC2702], [RFC2386]. 1917 This makes it possible to select routes that satisfy a given set of 1918 requirements subject to network and administrative policy 1919 constraints. Routes computed through constraint-based routing are 1920 not necessarily the shortest paths. Constraint-based routing works 1921 best with path oriented technologies that support explicit routing, 1922 such as MPLS. 1924 Constraint-based routing can also be used as a way to redistribute 1925 traffic onto the infrastructure (even for best effort traffic). For 1926 example, if the bandwidth requirements for path selection and 1927 reservable bandwidth attributes of network links are appropriately 1928 defined and configured, then congestion problems caused by uneven 1929 traffic distribution may be avoided or reduced. In this way, the 1930 performance and efficiency of the network can be improved. 1932 A number of enhancements are needed to conventional link state IGPs, 1933 such as OSPF and IS-IS, to allow them to distribute additional state 1934 information required for constraint-based routing. These extensions 1935 to OSPF were described in [RFC3630] and to IS-IS in [RFC5305]. 1936 Essentially, these enhancements require the propagation of additional 1937 information in link state advertisements. Specifically, in addition 1938 to normal link-state information, an enhanced IGP is required to 1939 propagate topology state information needed for constraint-based 1940 routing. Some of the additional topology state information includes 1941 link attributes such as reservable bandwidth and link resource class 1942 attribute (an administratively specified property of the link). The 1943 resource class attribute concept was defined in [RFC2702]. The 1944 additional topology state information is carried in new TLVs and sub- 1945 TLVs in IS-IS, or in the Opaque LSA in OSPF [RFC5305], [RFC3630]. 1947 An enhanced link-state IGP may flood information more frequently than 1948 a normal IGP. This is because even without changes in topology, 1949 changes in reservable bandwidth or link affinity can trigger the 1950 enhanced IGP to initiate flooding. A tradeoff is typically required 1951 between the timeliness of the information flooded and the flooding 1952 frequency to avoid excessive consumption of link bandwidth and 1953 computational resources, and more importantly, to avoid instability. 1955 In a TE system, it is also desirable for the routing subsystem to 1956 make the load splitting ratio among multiple paths (with equal cost 1957 or different cost) configurable. This capability gives network 1958 administrators more flexibility in the control of traffic 1959 distribution across the network. It can be very useful for avoiding/ 1960 relieving congestion in certain situations. Examples can be found in 1961 [XIAO]. 1963 The routing system should also have the capability to control the 1964 routes of subsets of traffic without affecting the routes of other 1965 traffic if sufficient resources exist for this purpose. This 1966 capability allows a more refined control over the distribution of 1967 traffic across the network.
For example, the ability to move traffic 1968 from a source to a destination away from its original path to another 1969 path (without affecting other traffic paths) allows traffic to be 1970 moved from resource-poor network segments to resource-rich segments. 1971 Path oriented technologies such as MPLS inherently support this 1972 capability as discussed in [AWD2]. 1974 Additionally, the routing subsystem should be able to select 1975 different paths for different classes of traffic (or for different 1976 traffic behavior aggregates) if the network supports multiple classes 1977 of service (different behavior aggregates). 1979 6.3. Traffic Mapping Recommendations 1981 Traffic mapping pertains to the assignment of traffic workload onto 1982 pre-established paths to meet certain requirements. Thus, while 1983 constraint-based routing deals with path selection, traffic mapping 1984 deals with the assignment of traffic to established paths which may 1985 have been selected by constraint-based routing or by some other 1986 means. Traffic mapping can be performed by time-dependent or state- 1987 dependent mechanisms, as described in Section 5.1. 1989 An important aspect of the traffic mapping function is the ability to 1990 establish multiple paths between an originating node and a 1991 destination node, and the capability to distribute the traffic 1992 between the two nodes across the paths according to some policies. A 1993 pre-condition for this scheme is the existence of flexible mechanisms 1994 to partition traffic and then assign the traffic partitions onto the 1995 parallel paths. This requirement was noted in [RFC2702]. When 1996 traffic is assigned to multiple parallel paths, it is recommended 1997 that special care should be taken to ensure proper ordering of 1998 packets belonging to the same application (or micro-flow) at the 1999 destination node of the parallel paths. 2001 As a general rule, mechanisms that perform the traffic mapping 2002 functions should aim to map the traffic onto the network 2003 infrastructure to minimize congestion. If the total traffic load 2004 cannot be accommodated, or if the routing and mapping functions 2005 cannot react fast enough to changing traffic conditions, then a 2006 traffic mapping system may rely on short time scale congestion 2007 control mechanisms (such as queue management, scheduling, etc.) to 2008 mitigate congestion. Thus, mechanisms that perform the traffic 2009 mapping functions should complement existing congestion control 2010 mechanisms. In an operational network, it is generally desirable to 2011 map the traffic onto the infrastructure such that intra-class and 2012 inter-class resource contention are minimized. 2014 When traffic mapping techniques that depend on dynamic state feedback 2015 (e.g., MATE and such like) are used, special care must be taken to 2016 guarantee network stability. 2018 6.4. Measurement Recommendations 2020 The importance of measurement in traffic engineering has been 2021 discussed throughout this document. Mechanisms should be provided to 2022 measure and collect statistics from the network to support the 2023 traffic engineering function. Additional capabilities may be needed 2024 to help in the analysis of the statistics. The actions of these 2025 mechanisms should not adversely affect the accuracy and integrity of 2026 the statistics collected. The mechanisms for statistical data 2027 acquisition should also be able to scale as the network evolves. 
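As a minimal illustration of how collected statistics can be turned into the kind of indicators discussed in this section, the Python sketch below converts two successive interface byte counters into a link utilization figure and flags a link as a possible congestion indicator when a threshold is crossed. The counter values, capacities, polling interval, and threshold are hypothetical assumptions, and the sketch is not tied to any particular measurement protocol.

   # Turn periodically polled byte counters into link utilization and
   # a simple congestion indicator.  All figures are hypothetical.
   SAMPLE_INTERVAL = 300.0        # seconds between polls

   samples = {
       # link: (bytes at t0, bytes at t1, capacity in bits/s)
       "A-B": (1_200_000_000, 4_800_000_000, 100_000_000),
       "B-C": (10_000_000, 40_000_000, 100_000_000),
   }

   CONGESTION_THRESHOLD = 0.8     # flag links above 80% utilization

   def utilization(bytes_t0, bytes_t1, capacity_bps):
       """Average utilization over the polling interval."""
       bits = (bytes_t1 - bytes_t0) * 8
       return bits / (SAMPLE_INTERVAL * capacity_bps)

   for link, (b0, b1, cap) in samples.items():
       util = utilization(b0, b1, cap)
       flag = "  <-- possible congestion" if util > CONGESTION_THRESHOLD else ""
       print(f"{link}: {util:.0%}{flag}")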
2029 Traffic statistics may be classified according to long-term or short- 2030 term timescales. Long-term timescale traffic statistics are very 2031 useful for traffic engineering. Long-term timescale traffic 2032 statistics may capture or reflect periodicity in network workload 2033 (such as hourly, daily, and weekly variations in traffic profiles) as 2034 well as traffic trends. Aspects of the monitored traffic statistics 2035 may also depict class of service characteristics for a network 2036 supporting multiple classes of service. Analysis of the long-term 2037 traffic statistics may yield secondary statistics such as busy hour 2038 characteristics, traffic growth patterns, persistent congestion 2039 problems, hot-spots, and imbalances in link utilization caused by 2040 routing anomalies. 2042 A mechanism for constructing traffic matrices for both long-term and 2043 short-term traffic statistics should be in place. In multi-service 2044 IP networks, the traffic matrices may be constructed for different 2045 service classes. Each element of a traffic matrix represents a 2046 statistic of traffic flow between a pair of abstract nodes. An 2047 abstract node may represent a router, a collection of routers, or a 2048 site in a VPN. 2050 Measured traffic statistics should provide reasonable and reliable 2051 indicators of the current state of the network on the short-term 2052 scale. Some short-term traffic statistics may reflect link 2053 utilization and link congestion status. Examples of congestion 2054 indicators include excessive packet delay, packet loss, and high 2055 resource utilization. Examples of mechanisms for distributing this 2056 kind of information include SNMP, probing techniques, FTP, IGP link 2057 state advertisements, etc. 2059 6.5. Network Survivability 2061 Network survivability refers to the capability of a network to 2062 maintain service continuity in the presence of faults. This can be 2063 accomplished by promptly recovering from network impairments and 2064 maintaining the required QoS for existing services after recovery. 2065 Survivability has become an issue of great concern within the 2066 Internet community due to the increasing demands to carry mission 2067 critical traffic, real-time traffic, and other high priority traffic 2068 over the Internet. Survivability can be addressed at the device 2069 level by developing network elements that are more reliable; and at 2070 the network level by incorporating redundancy into the architecture, 2071 design, and operation of networks. It is recommended that a 2072 philosophy of robustness and survivability should be adopted in the 2073 architecture, design, and operation of traffic engineering systems that 2074 control IP networks (especially public IP networks). Because 2075 different contexts may demand different levels of survivability, the 2076 mechanisms developed to support network survivability should be 2077 flexible so that they can be tailored to different needs. A number 2078 of tools and techniques have been developed to enable network 2079 survivability including MPLS Fast Reroute [RFC4090], RSVP-TE 2080 Extensions in Support of End-to-End Generalized Multi-Protocol Label 2081 Switching (GMPLS) Recovery [RFC4872], and GMPLS Segment Recovery 2082 [RFC4873]. 2084 Failure protection and restoration capabilities have become available 2085 from multiple layers as network technologies have continued to 2086 improve.
At the bottom of the layered stack, optical networks are 2087 now capable of providing dynamic ring and mesh restoration 2088 functionality at the wavelength level as well as traditional 2089 protection functionality. At the SONET/SDH layer, survivability 2090 capability is provided with Automatic Protection Switching (APS) as 2091 well as self-healing ring and mesh architectures. Similar 2092 functionality is provided by layer 2 technologies such as ATM 2093 (generally with slower mean restoration times). Rerouting is 2094 traditionally used at the IP layer to restore service following link 2095 and node outages. Rerouting at the IP layer occurs after a period of 2096 routing convergence which may require seconds to minutes to complete. 2097 Some new developments in the MPLS context make it possible to achieve 2098 recovery at the IP layer prior to convergence [RFC3469]. 2100 To support advanced survivability requirements, path-oriented 2101 technologies such as MPLS can be used to enhance the survivability of 2102 IP networks in a potentially cost effective manner. The advantages 2103 of path oriented technologies such as MPLS for IP restoration become 2104 even more evident when class based protection and restoration 2105 capabilities are required. 2107 Recently, a common suite of control plane protocols has been proposed 2108 for both MPLS and optical transport networks under the acronym Multi- 2109 protocol Lambda Switching [AWD1]. This new paradigm of Multi- 2110 protocol Lambda Switching will support even more sophisticated mesh 2111 restoration capabilities at the optical layer for the emerging IP 2112 over WDM network architectures. 2114 Another important aspect regarding multi-layer survivability is that 2115 technologies at different layers provide protection and restoration 2116 capabilities at different temporal granularities (in terms of time 2117 scales) and at different bandwidth granularities (from packet-level to 2118 wavelength level). Protection and restoration capabilities can also 2119 be sensitive to different service classes and different network 2120 utility models. 2122 The impact of service outages varies significantly for different 2123 service classes depending upon the effective duration of the outage. 2124 The duration of an outage can vary from milliseconds (with minor 2125 service impact) to seconds (with possible call drops for IP telephony 2126 and session time-outs for connection oriented transactions) to 2127 minutes and hours (with potentially considerable social and business 2128 impact). 2130 Coordinating different protection and restoration capabilities across 2131 multiple layers in a cohesive manner to ensure network survivability 2132 is maintained at reasonable cost is a challenging task. Protection 2133 and restoration coordination across layers may not always be 2134 feasible, because networks at different layers may belong to 2135 different administrative domains. 2137 The following paragraphs present some of the general recommendations 2138 for protection and restoration coordination. 2140 o Protection and restoration capabilities from different layers 2141 should be coordinated whenever feasible and appropriate to provide 2142 network survivability in a flexible and cost effective manner. 2143 Minimization of function duplication across layers is one way to 2144 achieve the coordination. Escalation of alarms and other fault 2145 indicators from lower to higher layers may also be performed in a 2146 coordinated manner.
A temporal order of restoration trigger 2147 timing at different layers is another way to coordinate multi- 2148 layer protection/restoration. 2150 o Spare capacity at higher layers is often regarded as working 2151 traffic at lower layers. Placing protection/restoration functions 2152 in many layers may increase redundancy and robustness, but it 2153 should not result in significant and avoidable inefficiencies in 2154 network resource utilization. 2156 o It is generally desirable to have protection and restoration 2157 schemes that are bandwidth efficient. 2159 o Failure notification throughout the network should be timely and 2160 reliable. 2162 o Alarms and other fault monitoring and reporting capabilities 2163 should be provided at appropriate layers. 2165 6.5.1. Survivability in MPLS Based Networks 2167 MPLS is an important emerging technology that enhances IP networks in 2168 terms of features, capabilities, and services. Because MPLS is path- 2169 oriented, it can potentially provide faster and more predictable 2170 protection and restoration capabilities than conventional hop by hop 2171 routed IP systems. This subsection describes some of the basic 2172 aspects and recommendations for MPLS networks regarding protection 2173 and restoration. See [RFC3469] for a more comprehensive discussion 2174 on MPLS based recovery. 2176 Protection types for MPLS networks can be categorized as link 2177 protection, node protection, path protection, and segment protection. 2179 o Link Protection: The objective for link protection is to protect 2180 an LSP from a given link failure. Under link protection, the path 2181 of the protection or backup LSP (the secondary LSP) is disjoint 2182 from the path of the working or operational LSP at the particular 2183 link over which protection is required. When the protected link 2184 fails, traffic on the working LSP is switched over to the 2185 protection LSP at the head-end of the failed link. This is a 2186 local repair method which can be fast. It might be more 2187 appropriate in situations where some network elements along a 2188 given path are less reliable than others. 2190 o Node Protection: The objective of LSP node protection is to 2191 protect an LSP from a given node failure. Under node protection, 2192 the path of the protection LSP is disjoint from the path of the 2193 working LSP at the particular node to be protected. The secondary 2194 path is also disjoint from the primary path at all links 2195 associated with the node to be protected. When the node fails, 2196 traffic on the working LSP is switched over to the protection LSP 2197 at the upstream LSR directly connected to the failed node. 2199 o Path Protection: The goal of LSP path protection is to protect an 2200 LSP from failure at any point along its routed path. Under path 2201 protection, the path of the protection LSP is completely disjoint 2202 from the path of the working LSP. The advantage of path 2203 protection is that the backup LSP protects the working LSP from 2204 all possible link and node failures along the path, except for 2205 failures that might occur at the ingress and egress LSRs, or for 2206 correlated failures that might impact both working and backup 2207 paths simultaneously. Additionally, since the path selection is 2208 end-to-end, path protection might be more efficient in terms of 2209 resource usage than link or node protection. However, path 2210 protection may be slower than link and node protection in general. 
o  Segment Protection: An MPLS domain may be partitioned into multiple protection domains whereby a failure in a protection domain is rectified within that domain.  In cases where an LSP traverses multiple protection domains, a protection mechanism within a domain only needs to protect the segment of the LSP that lies within the domain.  Segment protection will generally be faster than path protection because recovery generally occurs closer to the fault.

6.5.2.  Protection Options

Another issue to consider is the concept of protection options.  Protection options are expressed using the notation m:n, where m is the number of protection LSPs used to protect n working LSPs.  The feasible protection options are as follows.

o  1:1: one working LSP is protected/restored by one protection LSP.

o  1:n: one protection LSP is used to protect/restore n working LSPs.

o  n:1: one working LSP is protected/restored by n protection LSPs, possibly with a configurable load-splitting ratio.  When more than one protection LSP is used, it may be desirable to share the traffic across the protection LSPs when the working LSP fails to satisfy the bandwidth requirement of the traffic trunk associated with the working LSP.  This may be especially useful when it is not feasible to find one path that can satisfy the bandwidth requirement of the primary LSP.

o  1+1: traffic is sent concurrently on both the working LSP and the protection LSP.  In this case, the egress LSR selects one of the two LSPs based on a local traffic integrity decision process, which compares the traffic received from both the working and the protection LSP and identifies discrepancies.  It is unlikely that this option would be used extensively in IP networks due to its resource utilization inefficiency.  However, if bandwidth becomes plentiful and cheap, then this option might become quite viable and attractive in IP networks.

6.6.  Traffic Engineering in Diffserv Environments

This section provides an overview of the traffic engineering features and recommendations that are specifically pertinent to Differentiated Services (Diffserv) [RFC2475] capable IP networks.

Increasing requirements to support multiple classes of traffic, such as best effort and mission-critical data, in the Internet call for IP networks to differentiate traffic according to some criteria, and to accord preferential treatment to certain types of traffic.  Large numbers of flows can be aggregated into a few behavior aggregates based on common performance requirements (in terms of packet loss ratio, delay, and jitter) or on common fields within the IP packet headers.

As Diffserv evolves and becomes deployed in operational networks, traffic engineering will be critical to ensuring that SLAs defined within a given Diffserv service model are met.  Classes of service (CoS) can be supported in a Diffserv environment by concatenating per-hop behaviors (PHBs) along the routing path, using service provisioning mechanisms, and by appropriately configuring edge functionality such as traffic classification, marking, policing, and shaping.  A PHB is the forwarding behavior that a packet receives at a DS node (a Diffserv-compliant node).  This is accomplished by means of buffer management and packet scheduling mechanisms.
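As a purely illustrative sketch of the kind of packet scheduling mechanism referred to above (the class names, weights, and function below are hypothetical and are not drawn from any PHB specification), the following Python fragment shows a simple weighted round-robin selection between two class queues; a real DS node would combine such scheduling with buffer management (for example, RED) to realize a standardized PHB.

   from collections import deque

   # Hypothetical per-class queues and scheduling weights.
   queues = {"gold": deque(), "best-effort": deque()}
   weights = {"gold": 3, "best-effort": 1}   # up to 3 gold packets per cycle

   def schedule_one_round():
       """Weighted round-robin: dequeue up to 'weight' packets per class."""
       sent = []
       for cls, weight in weights.items():
           for _ in range(weight):
               if queues[cls]:
                   sent.append(queues[cls].popleft())
       return sent

   queues["gold"].extend(["g1", "g2", "g3", "g4"])
   queues["best-effort"].extend(["b1", "b2"])
   print(schedule_one_round())   # ['g1', 'g2', 'g3', 'b1']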
In this context, packets belonging to a class are those that are members of a corresponding ordering aggregate.

Traffic engineering can be used as a complement to Diffserv mechanisms to improve utilization of network resources, but it is not a necessary element in general.  When traffic engineering is used, it can be operated on an aggregated basis across all service classes [RFC3270] or on a per-service-class basis.  The former is used to provide better distribution of the aggregate traffic load over the network resources.  (See [RFC3270] for detailed mechanisms to support aggregate traffic engineering.)  The latter case is discussed below since it is specific to the Diffserv environment, with so-called Diffserv-aware traffic engineering [RFC4124].

For some Diffserv networks, it may be desirable to control the performance of some service classes by enforcing certain relationships between the traffic workload contributed by each service class and the amount of network resources allocated or provisioned for that service class.  Such relationships between demand and resource allocation can be enforced using a combination of, for example:

o  traffic engineering mechanisms on a per-service-class basis that enforce the desired relationship between the amount of traffic contributed by a given service class and the resources allocated to that class

o  mechanisms that dynamically adjust the resources allocated to a given service class to relate to the amount of traffic contributed by that service class.

It may also be desirable to limit the performance impact of high-priority traffic on relatively low-priority traffic.  This can be achieved by, for example, controlling the percentage of high-priority traffic that is routed through a given link.  Another way to accomplish this is to increase link capacities appropriately so that lower-priority traffic can still enjoy adequate service quality.  When the ratios of traffic workload contributed by different service classes vary significantly from router to router, it may not suffice to rely exclusively on conventional IGP routing protocols or on traffic engineering mechanisms that are insensitive to different service classes.  Instead, it may be desirable to perform traffic engineering, especially routing control and mapping functions, on a per-service-class basis.  One way to accomplish this in a domain that supports both MPLS and Diffserv is to define class-specific LSPs and to map traffic from each class onto one or more LSPs that correspond to that service class.  An LSP corresponding to a given service class can then be routed and protected/restored in a class-dependent manner, according to specific policies.

Performing traffic engineering on a per-class basis may require certain per-class parameters to be distributed.  Note that it is common to have some classes share some aggregate constraint (e.g., maximum bandwidth requirement) without enforcing the constraint on each individual class.  These classes can then be grouped into a class-type, and per-class-type parameters can be distributed instead to improve scalability.  This also allows better bandwidth sharing between classes in the same class-type.
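As an informal illustration of this kind of grouping (the class names, bandwidth figures, and function below are hypothetical and are not taken from [RFC4124]; the term class-type is defined more precisely below), the following Python sketch admits per-class bandwidth requests against a single aggregate constraint maintained for the class-type rather than against per-class limits.

   # Hypothetical sketch: admission against a class-type aggregate
   # constraint.  Individual classes (here AF1 and AF2) share the
   # class-type bandwidth freely; only the aggregate is enforced.

   CLASS_TYPE_MAX_MBPS = 400             # aggregate limit for the class-type
   class_type_members = {"AF1", "AF2"}   # classes grouped into one class-type
   admitted_mbps = {"AF1": 0, "AF2": 0}

   def admit(traffic_class, requested_mbps):
       """Return True if the request fits under the class-type aggregate."""
       if traffic_class not in class_type_members:
           raise ValueError("class is not a member of this class-type")
       if sum(admitted_mbps.values()) + requested_mbps > CLASS_TYPE_MAX_MBPS:
           return False                  # would violate the aggregate constraint
       admitted_mbps[traffic_class] += requested_mbps
       return True

   print(admit("AF1", 300))   # True  - aggregate is now 300 Mbit/s
   print(admit("AF2", 150))   # False - 450 Mbit/s would exceed the limit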
A class-type is a set of classes that satisfy the following two conditions:

o  Classes in the same class-type have common aggregate requirements to satisfy required performance levels.

o  There is no requirement to be enforced at the level of an individual class in the class-type.  Note that it is still possible, nevertheless, to implement some priority policies for classes in the same class-type to permit preferential access to the class-type bandwidth through the use of preemption priorities.

An example of a class-type is a low-loss class-type that includes both AF1-based and AF2-based Ordering Aggregates.  With such a class-type, one may implement a priority policy that assigns AF1-based traffic trunks a higher preemption priority than AF2-based ones, the reverse, or the same priority.

See [RFC4124] for detailed requirements on Diffserv-aware traffic engineering.

6.7.  Network Controllability

Off-line (and on-line) traffic engineering considerations would be of limited utility if the network could not be controlled effectively to implement the results of TE decisions and to achieve desired network performance objectives.  Capacity augmentation is a coarse-grained solution to traffic engineering issues.  It is simple and may be advantageous if bandwidth is abundant and cheap, or if the current or expected network workload demands it.  However, bandwidth is not always abundant and cheap, and the workload may not always demand additional capacity.  Adjustments of administrative weights and other parameters associated with routing protocols provide finer-grained control, but are difficult to use and imprecise because of the routing interactions that occur across the network.  In certain network contexts, more flexible, finer-grained approaches which provide more precise control over the mapping of traffic to routes and over the selection and placement of routes may be appropriate and useful.

Control mechanisms can be manual (e.g., administrative configuration), partially automated (e.g., scripts), or fully automated (e.g., policy-based management systems).  Automated mechanisms are particularly required in large-scale networks.  Multi-vendor interoperability can be facilitated by developing and deploying standardized management systems (e.g., standard MIBs) and policies (PIBs) to support the control functions required to address traffic engineering objectives such as load distribution and protection/restoration.

Network control functions should be secure, reliable, and stable as these are often needed to operate correctly in times of network impairments (e.g., during network congestion or security attacks).

7.  Inter-Domain Considerations

Inter-domain traffic engineering is concerned with performance optimization for traffic that originates in one administrative domain and terminates in a different one.

Traffic exchange between autonomous systems in the Internet occurs through exterior gateway protocols.  Currently, BGP [RFC4271] is the standard exterior gateway protocol for the Internet.  BGP provides a number of attributes and capabilities (e.g., route filtering) that can be used for inter-domain traffic engineering.
More specifically, BGP permits the control of routing information and traffic exchange between Autonomous Systems (ASes) in the Internet.  BGP incorporates a sequential decision process which calculates the degree of preference for various routes to a given destination network.  There are two fundamental aspects to inter-domain traffic engineering using BGP:

o  Route Redistribution: controlling the import and export of routes between ASes, and controlling the redistribution of routes between BGP and other protocols within an AS.

o  Best path selection: selecting the best path when there are multiple candidate paths to a given destination network.  Best path selection is performed by the BGP decision process based on a sequential procedure, taking a number of different considerations into account.  Ultimately, best path selection under BGP boils down to selecting preferred exit points out of an AS towards specific destination networks.  The BGP path selection process can be influenced by manipulating the attributes associated with the BGP decision process.  These attributes include: NEXT-HOP, WEIGHT (a Cisco proprietary attribute that is also implemented by some other vendors), LOCAL-PREFERENCE, AS-PATH, ROUTE-ORIGIN, MULTI-EXIT-DISCRIMINATOR (MED), IGP METRIC, etc.

Route-maps provide the flexibility to implement complex BGP policies based on pre-configured logical conditions.  In particular, Route-maps can be used to control import and export policies for incoming and outgoing routes, control the redistribution of routes between BGP and other protocols, and influence the selection of best paths by manipulating the attributes associated with the BGP decision process.  Very complex logical expressions that implement various types of policies can be constructed using a combination of Route-maps, BGP attributes, Access-lists, and Community attributes.

When looking at possible strategies for inter-domain TE with BGP, it must be noted that the outbound traffic exit point is controllable, whereas the interconnection point where inbound traffic is received from an EBGP peer typically is not, unless a special arrangement is made with the peer sending the traffic.  Therefore, it is up to each individual network to implement sound TE strategies that deal with the efficient delivery of outbound traffic from one's customers to one's peering points.  The vast majority of TE policy is based upon a "closest exit" strategy, which offloads inter-domain traffic at the nearest outbound peer point towards the destination autonomous system.  Most methods of manipulating the point at which inbound traffic enters a network from an EBGP peer (inconsistent route announcements between peering points, AS-path prepending, and sending MEDs) are either ineffective or not accepted in the peering community.

Inter-domain TE with BGP is generally effective, but it is usually applied in a trial-and-error fashion.  A systematic approach for inter-domain traffic engineering is yet to be devised.

Inter-domain TE is inherently more difficult than intra-domain TE under the current Internet architecture.  The reasons for this are both technical and administrative.
Technically, while topology and link state information are helpful for mapping traffic more effectively, BGP does not propagate such information across domain boundaries for stability and scalability reasons.  Administratively, there are differences in operating costs and network capacities between domains.  Generally, what may be considered a good solution in one domain may not necessarily be a good solution in another domain.  Moreover, it would generally be considered inadvisable for one domain to permit another domain to influence the routing and management of traffic in its network.

MPLS TE-tunnels (explicit LSPs) can potentially add a degree of flexibility in the selection of exit points for inter-domain routing.  The concept of relative and absolute metrics can be applied to this purpose.  The idea is that if BGP attributes are defined such that the BGP decision process depends on IGP metrics to select exit points for inter-domain traffic, then some inter-domain traffic destined to a given peer network can be made to prefer a specific exit point by establishing a TE-tunnel from the router making the selection to the peering point and assigning the TE-tunnel a metric which is smaller than the IGP cost to all other peering points.  If a peer accepts and processes MEDs, then a similar MPLS TE-tunnel-based scheme can be applied to cause certain entrance points to be preferred by setting the MED to be an IGP cost which has been modified by the tunnel metric.

Similar to intra-domain TE, inter-domain TE is best accomplished when a traffic matrix can be derived to depict the volume of traffic from one autonomous system to another.

Generally, redistribution of inter-domain traffic requires coordination between peering partners.  An export policy in one domain that results in load redistribution across peer points with another domain can significantly affect the local traffic matrix inside the domain of the peering partner.  This, in turn, will affect the intra-domain TE due to changes in the spatial distribution of traffic.  Therefore, it is mutually beneficial for peering partners to coordinate with each other before attempting any policy changes that may result in significant shifts in inter-domain traffic.  In certain contexts, this coordination can be quite challenging for technical and non-technical reasons.

It is a matter of speculation as to whether MPLS, or similar technologies, can be extended to allow selection of constrained paths across domain boundaries.

8.  Overview of Contemporary TE Practices in Operational IP Networks

This section provides an overview of some contemporary traffic engineering practices in IP networks.  The focus is primarily on the aspects that pertain to the control of the routing function in operational contexts.  The intent here is to provide an overview of the commonly used practices.  The discussion is not intended to be exhaustive.

Currently, service providers apply many of the traffic engineering mechanisms discussed in this document to optimize the performance of their IP networks.
These techniques include capacity planning for long timescales, routing control using IGP metrics and MPLS for medium timescales, the overlay model (also for medium timescales), and traffic management mechanisms for short timescales.

When a service provider plans to build an IP network, or expand the capacity of an existing network, effective capacity planning should be an important component of the process.  Such plans may take the following aspects into account: location of new nodes (if any), existing and predicted traffic patterns, costs, link capacity, topology, routing design, and survivability.

Performance optimization of operational networks is usually an ongoing process in which traffic statistics, performance parameters, and fault indicators are continually collected from the network.  This empirical data is then analyzed and used to trigger various traffic engineering mechanisms.  Tools that perform what-if analysis can also be used to assist the TE process by allowing various scenarios to be reviewed before a new set of configurations is implemented in the operational network.

Traditionally, intra-domain real-time TE with the IGP is done by increasing the OSPF or IS-IS metric of a congested link until enough traffic has been diverted from that link.  This approach has some limitations as discussed in Section 6.2.  Recently, some new intra-domain TE approaches/tools have been proposed [RR94] [FT00] [FT01] [WANG].  Such approaches/tools take the traffic matrix, network topology, and network performance objectives as input, and produce link metrics and possibly unequal load-sharing ratios to be set at the head-end routers of some ECMPs as output.  These developments open up the possibility of performing intra-domain TE with the IGP in a more systematic way.

The overlay model (IP over ATM, or IP over Frame Relay) is another approach which was commonly used [AWD2], but it has been replaced by MPLS and router hardware technology.

Deployment of MPLS for traffic engineering applications has commenced in some service provider networks.  One operational scenario is to deploy MPLS in conjunction with an IGP (IS-IS-TE or OSPF-TE) that supports the traffic engineering extensions, with constraint-based routing for explicit route computations, and with a signaling protocol (e.g., RSVP-TE) for LSP instantiation.

In contemporary MPLS traffic engineering contexts, network administrators specify and configure link attributes and resource constraints such as maximum reservable bandwidth and resource class attributes for links (interfaces) within the MPLS domain.  A link state protocol that supports TE extensions (IS-IS-TE or OSPF-TE) is used to propagate information about network topology and link attributes to all routers in the routing area.  Network administrators also specify all the LSPs that are to originate at each router.  For each LSP, the network administrator specifies the destination node and the attributes of the LSP which indicate the requirements to be satisfied during the path selection process.  Each router then uses a local constraint-based routing process to compute explicit paths for all LSPs originating from it.  Subsequently, a signaling protocol is used to instantiate the LSPs.
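The constraint-based path computation step can be pictured as a shortest-path search over a topology from which links that cannot satisfy the LSP's constraints have been pruned.  The Python sketch below is a simplified illustration of that idea only (the topology, bandwidth figures, and function name are hypothetical, and it is not an implementation of any particular router's algorithm).

   import heapq

   # Hypothetical TE topology:
   # (node, neighbor) -> (TE metric, unreserved bandwidth in Mbit/s)
   links = {
       ("A", "B"): (10, 400), ("B", "C"): (10, 100),
       ("A", "D"): (20, 600), ("D", "C"): (20, 600),
   }

   def constrained_spf(src, dst, required_mbps):
       """Prune links lacking bandwidth, then run a shortest-path search
       on the remaining topology (simplified constraint-based routing)."""
       adj = {}
       for (u, v), (metric, bw) in links.items():
           if bw >= required_mbps:       # constraint: enough unreserved bandwidth
               adj.setdefault(u, []).append((v, metric))
       queue, seen = [(0, src, [src])], set()
       while queue:
           cost, node, path = heapq.heappop(queue)
           if node == dst:
               return cost, path         # explicit path handed to the signaling protocol
           if node in seen:
               continue
           seen.add(node)
           for nxt, metric in adj.get(node, []):
               if nxt not in seen:
                   heapq.heappush(queue, (cost + metric, nxt, path + [nxt]))
       return None                       # no feasible path under the constraint

   # Avoids the B-C link, which has only 100 Mbit/s unreserved.
   print(constrained_spf("A", "C", 300))   # (40, ['A', 'D', 'C'])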
By assigning proper bandwidth values to links and LSPs, congestion caused by uneven traffic distribution can generally be avoided or mitigated.

The bandwidth attributes of LSPs used for traffic engineering can be updated periodically.  The basic concept is that the bandwidth assigned to an LSP should relate in some manner to the bandwidth requirements of traffic that actually flows through the LSP.  The traffic attribute of an LSP can be modified to accommodate traffic growth and persistent traffic shifts.  If network congestion occurs due to some unexpected events, existing LSPs can be rerouted to alleviate the situation, or the network administrator can configure new LSPs to divert some traffic to alternative paths.  The reservable bandwidth of the congested links can also be reduced to force some LSPs to be rerouted to other paths.

In an MPLS domain, a traffic matrix can also be estimated by monitoring the traffic on LSPs.  Such traffic statistics can be used for a variety of purposes including network planning and network optimization.  Current practice suggests that deploying an MPLS network consisting of hundreds of routers and thousands of LSPs is feasible.  In summary, recent deployment experience suggests that the MPLS approach is very effective for traffic engineering in IP networks [XIAO].

As mentioned previously in Section 7, one usually has no direct control over the distribution of inbound traffic.  Therefore, the main goal of contemporary inter-domain TE is to optimize the distribution of outbound traffic between multiple inter-domain links.  When operating a global network, maintaining the ability to operate the network in a regional fashion where desired, while continuing to take advantage of the benefits of a global network, also becomes an important objective.

Inter-domain TE with BGP usually begins with the placement of multiple peering interconnection points in locations that have high peer density, are in close proximity to originating/terminating traffic locations on one's own network, and are lowest in cost.  There are generally several locations in each region of the world where the vast majority of major networks congregate and interconnect.  Some location-decision problems that arise in association with inter-domain routing are discussed in [AWD5].

Once the locations of the interconnects are determined, and circuits are implemented, one decides how best to handle the routes heard from the peer, as well as how to propagate the peers' routes within one's own network.  One way to engineer outbound traffic flows on a network with many EBGP peers is to create a hierarchy of peers.  Generally, the Local Preferences of all peers are set to the same value so that the shortest AS paths will be chosen to forward traffic.  Then, by overwriting the inbound MED metric (Multi-Exit-Discriminator metric, also referred to as the "BGP metric"; both terms are used interchangeably in this document) with BGP metrics for routes received at different peers, the hierarchy can be formed.  For example, all Local Preferences can be set to 200, preferred private peers can be assigned a BGP metric of 50, the rest of the private peers can be assigned a BGP metric of 100, and public peers can be assigned a BGP metric of 600.
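As a rough illustration of how such a hierarchy influences exit selection (the peer names and values below are hypothetical, and the comparison is a simplification of the full BGP decision process), the following Python sketch selects an exit point by comparing Local Preference, AS-path length, and the locally assigned BGP metric, in that order.

   # Hypothetical candidate routes for one prefix, one per peering exit.
   # Each tuple: (exit point, local_pref, as_path_length, bgp_metric)
   candidates = [
       ("preferred-private-peer", 200, 3,  50),
       ("private-peer",           200, 3, 100),
       ("public-peer",            200, 3, 600),
   ]

   def best_exit(routes):
       """Simplified ordering: highest Local Preference, then shortest
       AS path, then lowest locally assigned BGP metric (the MED value
       overwritten at ingress)."""
       return min(routes, key=lambda r: (-r[1], r[2], r[3]))[0]

   print(best_exit(candidates))   # "preferred-private-peer" wins on BGP metric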
"Preferred" peers might be defined as those peers 2627 with whom the most available capacity exists, whose customer base is 2628 larger in comparison to other peers, whose interconnection costs are 2629 the lowest, and with whom upgrading existing capacity is the easiest. 2630 In a network with low utilization at the edge, this works well. The 2631 same concept could be applied to a network with higher edge 2632 utilization by creating more levels of BGP metrics between peers, 2633 allowing for more granularity in selecting the exit points for 2634 traffic bound for a dual homed customer on a peer's network. 2636 By only replacing inbound MED metrics with BGP metrics, only equal 2637 AS-Path length routes' exit points are being changed. (The BGP 2638 decision considers Local Preference first, then AS-Path length, and 2639 then BGP metric). For example, assume a network has two possible 2640 egress points, peer A and peer B. Each peer has 40% of the 2641 Internet's routes exclusively on its network, while the remaining 20% 2642 of the Internet's routes are from customers who dual home between A 2643 and B. Assume that both peers have a Local Preference of 200 and a 2644 BGP metric of 100. If the link to peer A is congested, increasing 2645 its BGP metric while leaving the Local Preference at 200 will ensure 2646 that the 20% of total routes belonging to dual homed customers will 2647 prefer peer B as the exit point. The previous example would be used 2648 in a situation where all exit points to a given peer were close to 2649 congestion levels, and traffic needed to be shifted away from that 2650 peer entirely. 2652 When there are multiple exit points to a given peer, and only one of 2653 them is congested, it is not necessary to shift traffic away from the 2654 peer entirely, but only from the one congested circuit. This can be 2655 achieved by using passive IGP-metrics, AS-path filtering, or prefix 2656 filtering. 2658 Occasionally, more drastic changes are needed, for example, in 2659 dealing with a "problem peer" who is difficult to work with on 2660 upgrades or is charging high prices for connectivity to their 2661 network. In that case, the Local Preference to that peer can be 2662 reduced below the level of other peers. This effectively reduces the 2663 amount of traffic sent to that peer to only originating traffic 2664 (assuming no transit providers are involved). This type of change 2665 can affect a large amount of traffic, and is only used after other 2666 methods have failed to provide the desired results. 2668 Although it is not much of an issue in regional networks, the 2669 propagation of a peer's routes back through the network must be 2670 considered when a network is peering on a global scale. Sometimes, 2671 business considerations can influence the choice of BGP policies in a 2672 given context. For example, it may be imprudent, from a business 2673 perspective, to operate a global network and provide full access to 2674 the global customer base to a small network in a particular country. 2675 However, for the purpose of providing one's own customers with 2676 quality service in a particular region, good connectivity to that in- 2677 country network may still be necessary. This can be achieved by 2678 assigning a set of communities at the edge of the network, which have 2679 a known behavior when routes tagged with those communities are 2680 propagating back through the core. 
Routes heard from local peers will be prevented from propagating back to the global network, whereas routes learned from larger peers may be allowed to propagate freely throughout the entire global network.  By implementing a flexible community strategy, the benefits of using a single global AS Number (ASN) can be realized, while the benefits of operating regional networks can also be taken advantage of.  An alternative to doing this is to use different ASNs in different regions, with the consequence that the AS path length for routes announced by that service provider will increase.

9.  Conclusion

This document described principles for traffic engineering in the Internet.  It presented an overview of some of the basic issues surrounding traffic engineering in IP networks.  The context of TE was described, and a TE process model and a taxonomy of TE styles were presented.  A brief historical review of pertinent developments related to traffic engineering was provided.  A survey of contemporary TE techniques in operational networks was presented.  Additionally, the document specified a set of generic requirements, recommendations, and options for Internet traffic engineering.

10.  Security Considerations

This document does not introduce new security issues.

11.  IANA Considerations

This draft makes no requests for IANA action.

12.  Acknowledgments

Much of the text in this document is derived from RFC 3272.  The authors of this document would like to express their gratitude to all involved in that work.  Although the source text has been edited in the production of this document, the original authors should be considered as Contributors to this work.  They were:

Daniel O. Awduche
Movaz Networks
7926 Jones Branch Drive, Suite 615
McLean, VA 22102

Phone: 703-298-5291
EMail: awduche@movaz.com

Angela Chiu
Celion Networks
1 Sheila Dr., Suite 2
Tinton Falls, NJ 07724

Phone: 732-747-9987
EMail: angela.chiu@celion.com

Anwar Elwalid
Lucent Technologies
Murray Hill, NJ 07974

Phone: 908 582-7589
EMail: anwar@lucent.com

Indra Widjaja
Bell Labs, Lucent Technologies
600 Mountain Avenue
Murray Hill, NJ 07974

Phone: 908 582-0435
EMail: iwidjaja@research.bell-labs.com

XiPeng Xiao
Redback Networks
300 Holger Way
San Jose, CA 95134

Phone: 408-750-5217
EMail: xipeng@redback.com

The acknowledgements in RFC 3272 were as below.  All people who helped in the production of that document also need to be thanked for the carry-over into this new document.

The authors would like to thank Jim Boyle for inputs on the recommendations section, Francois Le Faucheur for inputs on Diffserv aspects, Blaine Christian for inputs on measurement, Gerald Ash for inputs on routing in telephone networks and for text on event-dependent TE methods, Steven Wright for inputs on network controllability, and Jonathan Aufderheide for inputs on inter-domain TE with BGP.  Special thanks to Randy Bush for proposing the TE taxonomy based on "tactical versus strategic" methods.  The subsection describing an "Overview of ITU Activities Related to Traffic Engineering" was adapted from a contribution by Waisum Lai.  Useful feedback and pointers to relevant materials were provided by J. Noel Chiappa.
Additional comments were provided by Glenn Grotefeld during the working group last call process.  Finally, the authors would like to thank Ed Kern, the TEWG co-chair, for his comments and support.

The early versions of this document were produced by the TEAS Working Group's RFC3272bis Design Team.  The full list of members of this team is:

Acee Lindem
Adrian Farrel
Aijun Wang
Daniele Ceccarelli
Dieter Beller
Jeff Tantsura
Julien Meuric
Liu Hua
Loa Andersson
Luis Miguel Contreras
Martin Horneffer
Tarek Saad
Xufeng Liu

The production of this document includes a fix to the original text resulting from an Errata Report by Jean-Michel Grimaldi.

The authors of this document would also like to thank Dhruv Dhody for review comments.

13.  Contributors

The following people contributed substantive text to this document:

Gert Grammel
EMail: ggrammel@juniper.net

Loa Andersson
EMail: loa@pi.nu

Xufeng Liu
EMail: xufeng.liu.ietf@gmail.com

Lou Berger
EMail: lberger@labn.net

Jeff Tantsura
EMail: jefftant.ietf@gmail.com

14.  Informative References

[AJ19] Adekitan, A., Abolade, J., and O. Shobayo, "Data mining approach for predicting the daily Internet data traffic of a smart university", Article Journal of Big Data, 2019, Volume 6, Number 1, Page 1, 1998.

[ASH2] Ash, J., "Dynamic Routing in Telecommunications Networks", Book McGraw Hill, 1998.

[AWD1] Awduche, D. and Y. Rekhter, "Multiprotocol Lambda Switching - Combining MPLS Traffic Engineering Control with Optical Crossconnects", Article IEEE Communications Magazine, March 2001.

[AWD2] Awduche, D., "MPLS and Traffic Engineering in IP Networks", Article IEEE Communications Magazine, December 1999.

[AWD5] Awduche, D., "An Approach to Optimal Peering Between Autonomous Systems in the Internet", Paper International Conference on Computer Communications and Networks (ICCCN'98), October 1998.

[FLJA93] Floyd, S. and V. Jacobson, "Random Early Detection Gateways for Congestion Avoidance", Article IEEE/ACM Transactions on Networking, Vol. 1, p. 387-413, November 1993.

[FLOY94] Floyd, S., "TCP and Explicit Congestion Notification", Article ACM Computer Communication Review, V. 24, No. 5, p. 10-23, October 1994.

[FT00] Fortz, B. and M. Thorup, "Internet Traffic Engineering by Optimizing OSPF Weights", Article IEEE INFOCOM 2000, March 2000.

[FT01] Fortz, B. and M. Thorup, "Optimizing OSPF/IS-IS Weights in a Changing World", n.d.

[HUSS87] Hurley, B., Seidl, C., and W. Sewel, "A Survey of Dynamic Routing Methods for Circuit-Switched Traffic", Article IEEE Communication Magazine, September 1987.

[I-D.ietf-idr-rfc5575bis] Loibl, C., Hares, S., Raszuk, R., McPherson, D., and M. Bacher, "Dissemination of Flow Specification Rules", draft-ietf-idr-rfc5575bis-27 (work in progress), October 2020.

[I-D.ietf-tewg-qos-routing] Ash, G., "Traffic Engineering & QoS Methods for IP-, ATM-, & Based Multiservice Networks", draft-ietf-tewg-qos-routing-04 (work in progress), October 2001.

[ITU-E600] "Terms and Definitions of Traffic Engineering", Recommendation ITU-T Recommendation E.600, March 1993.
2882 [ITU-E701] 2883 "Reference Connections for Traffic Engineering", 2884 Recommendation ITU-T Recommendation E.701, October 1993. 2886 [ITU-E801] 2887 "Framework for Service Quality Agreement", 2888 Recommendation ITU-T Recommendation E.801, October 1996. 2890 [MA] Ma, Q., "Quality of Service Routing in Integrated Services 2891 Networks", Ph.D. PhD Dissertation, CMU-CS-98-138, CMU, 2892 1998. 2894 [MATE] Elwalid, A., Jin, C., Low, S., and I. Widjaja, "MATE - 2895 MPLS Adaptive Traffic Engineering", 2896 Proceedings INFOCOM'01, April 2001. 2898 [MCQ80] McQuillan, J., Richer, I., and E. Rosen, "The New Routing 2899 Algorithm for the ARPANET", Transaction IEEE Transactions 2900 on Communications, vol. 28, no. 5, p. 711-719, May 1980. 2902 [MR99] Mitra, D. and K. Ramakrishnan, "A Case Study of 2903 Multiservice, Multipriority Traffic Engineering Design for 2904 Data Networks", Proceedings Globecom'99, December 1999. 2906 [RFC0791] Postel, J., "Internet Protocol", STD 5, RFC 791, 2907 DOI 10.17487/RFC0791, September 1981, 2908 . 2910 [RFC1102] Clark, D., "Policy routing in Internet protocols", 2911 RFC 1102, DOI 10.17487/RFC1102, May 1989, 2912 . 2914 [RFC1104] Braun, H., "Models of policy based routing", RFC 1104, 2915 DOI 10.17487/RFC1104, June 1989, 2916 . 2918 [RFC1992] Castineyra, I., Chiappa, N., and M. Steenstrup, "The 2919 Nimrod Routing Architecture", RFC 1992, 2920 DOI 10.17487/RFC1992, August 1996, 2921 . 2923 [RFC2205] Braden, R., Ed., Zhang, L., Berson, S., Herzog, S., and S. 2924 Jamin, "Resource ReSerVation Protocol (RSVP) -- Version 1 2925 Functional Specification", RFC 2205, DOI 10.17487/RFC2205, 2926 September 1997, . 2928 [RFC2328] Moy, J., "OSPF Version 2", STD 54, RFC 2328, 2929 DOI 10.17487/RFC2328, April 1998, 2930 . 2932 [RFC2330] Paxson, V., Almes, G., Mahdavi, J., and M. Mathis, 2933 "Framework for IP Performance Metrics", RFC 2330, 2934 DOI 10.17487/RFC2330, May 1998, 2935 . 2937 [RFC2386] Crawley, E., Nair, R., Rajagopalan, B., and H. Sandick, "A 2938 Framework for QoS-based Routing in the Internet", 2939 RFC 2386, DOI 10.17487/RFC2386, August 1998, 2940 . 2942 [RFC2474] Nichols, K., Blake, S., Baker, F., and D. Black, 2943 "Definition of the Differentiated Services Field (DS 2944 Field) in the IPv4 and IPv6 Headers", RFC 2474, 2945 DOI 10.17487/RFC2474, December 1998, 2946 . 2948 [RFC2475] Blake, S., Black, D., Carlson, M., Davies, E., Wang, Z., 2949 and W. Weiss, "An Architecture for Differentiated 2950 Services", RFC 2475, DOI 10.17487/RFC2475, December 1998, 2951 . 2953 [RFC2597] Heinanen, J., Baker, F., Weiss, W., and J. Wroclawski, 2954 "Assured Forwarding PHB Group", RFC 2597, 2955 DOI 10.17487/RFC2597, June 1999, 2956 . 2958 [RFC2678] Mahdavi, J. and V. Paxson, "IPPM Metrics for Measuring 2959 Connectivity", RFC 2678, DOI 10.17487/RFC2678, September 2960 1999, . 2962 [RFC2702] Awduche, D., Malcolm, J., Agogbua, J., O'Dell, M., and J. 2963 McManus, "Requirements for Traffic Engineering Over MPLS", 2964 RFC 2702, DOI 10.17487/RFC2702, September 1999, 2965 . 2967 [RFC2722] Brownlee, N., Mills, C., and G. Ruth, "Traffic Flow 2968 Measurement: Architecture", RFC 2722, 2969 DOI 10.17487/RFC2722, October 1999, 2970 . 2972 [RFC2753] Yavatkar, R., Pendarakis, D., and R. Guerin, "A Framework 2973 for Policy-based Admission Control", RFC 2753, 2974 DOI 10.17487/RFC2753, January 2000, 2975 . 2977 [RFC2961] Berger, L., Gan, D., Swallow, G., Pan, P., Tommasi, F., 2978 and S. 
Molendini, "RSVP Refresh Overhead Reduction 2979 Extensions", RFC 2961, DOI 10.17487/RFC2961, April 2001, 2980 . 2982 [RFC2998] Bernet, Y., Ford, P., Yavatkar, R., Baker, F., Zhang, L., 2983 Speer, M., Braden, R., Davie, B., Wroclawski, J., and E. 2984 Felstaine, "A Framework for Integrated Services Operation 2985 over Diffserv Networks", RFC 2998, DOI 10.17487/RFC2998, 2986 November 2000, . 2988 [RFC3031] Rosen, E., Viswanathan, A., and R. Callon, "Multiprotocol 2989 Label Switching Architecture", RFC 3031, 2990 DOI 10.17487/RFC3031, January 2001, 2991 . 2993 [RFC3086] Nichols, K. and B. Carpenter, "Definition of 2994 Differentiated Services Per Domain Behaviors and Rules for 2995 their Specification", RFC 3086, DOI 10.17487/RFC3086, 2996 April 2001, . 2998 [RFC3124] Balakrishnan, H. and S. Seshan, "The Congestion Manager", 2999 RFC 3124, DOI 10.17487/RFC3124, June 2001, 3000 . 3002 [RFC3209] Awduche, D., Berger, L., Gan, D., Li, T., Srinivasan, V., 3003 and G. Swallow, "RSVP-TE: Extensions to RSVP for LSP 3004 Tunnels", RFC 3209, DOI 10.17487/RFC3209, December 2001, 3005 . 3007 [RFC3270] Le Faucheur, F., Wu, L., Davie, B., Davari, S., Vaananen, 3008 P., Krishnan, R., Cheval, P., and J. Heinanen, "Multi- 3009 Protocol Label Switching (MPLS) Support of Differentiated 3010 Services", RFC 3270, DOI 10.17487/RFC3270, May 2002, 3011 . 3013 [RFC3272] Awduche, D., Chiu, A., Elwalid, A., Widjaja, I., and X. 3014 Xiao, "Overview and Principles of Internet Traffic 3015 Engineering", RFC 3272, DOI 10.17487/RFC3272, May 2002, 3016 . 3018 [RFC3469] Sharma, V., Ed. and F. Hellstrand, Ed., "Framework for 3019 Multi-Protocol Label Switching (MPLS)-based Recovery", 3020 RFC 3469, DOI 10.17487/RFC3469, February 2003, 3021 . 3023 [RFC3630] Katz, D., Kompella, K., and D. Yeung, "Traffic Engineering 3024 (TE) Extensions to OSPF Version 2", RFC 3630, 3025 DOI 10.17487/RFC3630, September 2003, 3026 . 3028 [RFC3945] Mannie, E., Ed., "Generalized Multi-Protocol Label 3029 Switching (GMPLS) Architecture", RFC 3945, 3030 DOI 10.17487/RFC3945, October 2004, 3031 . 3033 [RFC4090] Pan, P., Ed., Swallow, G., Ed., and A. Atlas, Ed., "Fast 3034 Reroute Extensions to RSVP-TE for LSP Tunnels", RFC 4090, 3035 DOI 10.17487/RFC4090, May 2005, 3036 . 3038 [RFC4124] Le Faucheur, F., Ed., "Protocol Extensions for Support of 3039 Diffserv-aware MPLS Traffic Engineering", RFC 4124, 3040 DOI 10.17487/RFC4124, June 2005, 3041 . 3043 [RFC4203] Kompella, K., Ed. and Y. Rekhter, Ed., "OSPF Extensions in 3044 Support of Generalized Multi-Protocol Label Switching 3045 (GMPLS)", RFC 4203, DOI 10.17487/RFC4203, October 2005, 3046 . 3048 [RFC4271] Rekhter, Y., Ed., Li, T., Ed., and S. Hares, Ed., "A 3049 Border Gateway Protocol 4 (BGP-4)", RFC 4271, 3050 DOI 10.17487/RFC4271, January 2006, 3051 . 3053 [RFC4594] Babiarz, J., Chan, K., and F. Baker, "Configuration 3054 Guidelines for DiffServ Service Classes", RFC 4594, 3055 DOI 10.17487/RFC4594, August 2006, 3056 . 3058 [RFC4655] Farrel, A., Vasseur, J., and J. Ash, "A Path Computation 3059 Element (PCE)-Based Architecture", RFC 4655, 3060 DOI 10.17487/RFC4655, August 2006, 3061 . 3063 [RFC4872] Lang, J., Ed., Rekhter, Y., Ed., and D. Papadimitriou, 3064 Ed., "RSVP-TE Extensions in Support of End-to-End 3065 Generalized Multi-Protocol Label Switching (GMPLS) 3066 Recovery", RFC 4872, DOI 10.17487/RFC4872, May 2007, 3067 . 3069 [RFC4873] Berger, L., Bryskin, I., Papadimitriou, D., and A. Farrel, 3070 "GMPLS Segment Recovery", RFC 4873, DOI 10.17487/RFC4873, 3071 May 2007, . 
3073 [RFC5305] Li, T. and H. Smit, "IS-IS Extensions for Traffic 3074 Engineering", RFC 5305, DOI 10.17487/RFC5305, October 3075 2008, . 3077 [RFC5331] Aggarwal, R., Rekhter, Y., and E. Rosen, "MPLS Upstream 3078 Label Assignment and Context-Specific Label Space", 3079 RFC 5331, DOI 10.17487/RFC5331, August 2008, 3080 . 3082 [RFC5394] Bryskin, I., Papadimitriou, D., Berger, L., and J. Ash, 3083 "Policy-Enabled Path Computation Framework", RFC 5394, 3084 DOI 10.17487/RFC5394, December 2008, 3085 . 3087 [RFC5440] Vasseur, JP., Ed. and JL. Le Roux, Ed., "Path Computation 3088 Element (PCE) Communication Protocol (PCEP)", RFC 5440, 3089 DOI 10.17487/RFC5440, March 2009, 3090 . 3092 [RFC6119] Harrison, J., Berger, J., and M. Bartlett, "IPv6 Traffic 3093 Engineering in IS-IS", RFC 6119, DOI 10.17487/RFC6119, 3094 February 2011, . 3096 [RFC6241] Enns, R., Ed., Bjorklund, M., Ed., Schoenwaelder, J., Ed., 3097 and A. Bierman, Ed., "Network Configuration Protocol 3098 (NETCONF)", RFC 6241, DOI 10.17487/RFC6241, June 2011, 3099 . 3101 [RFC6374] Frost, D. and S. Bryant, "Packet Loss and Delay 3102 Measurement for MPLS Networks", RFC 6374, 3103 DOI 10.17487/RFC6374, September 2011, 3104 . 3106 [RFC6805] King, D., Ed. and A. Farrel, Ed., "The Application of the 3107 Path Computation Element Architecture to the Determination 3108 of a Sequence of Domains in MPLS and GMPLS", RFC 6805, 3109 DOI 10.17487/RFC6805, November 2012, 3110 . 3112 [RFC7149] Boucadair, M. and C. Jacquenet, "Software-Defined 3113 Networking: A Perspective from within a Service Provider 3114 Environment", RFC 7149, DOI 10.17487/RFC7149, March 2014, 3115 . 3117 [RFC7390] Rahman, A., Ed. and E. Dijk, Ed., "Group Communication for 3118 the Constrained Application Protocol (CoAP)", RFC 7390, 3119 DOI 10.17487/RFC7390, October 2014, 3120 . 3122 [RFC7471] Giacalone, S., Ward, D., Drake, J., Atlas, A., and S. 3123 Previdi, "OSPF Traffic Engineering (TE) Metric 3124 Extensions", RFC 7471, DOI 10.17487/RFC7471, March 2015, 3125 . 3127 [RFC7679] Almes, G., Kalidindi, S., Zekauskas, M., and A. Morton, 3128 Ed., "A One-Way Delay Metric for IP Performance Metrics 3129 (IPPM)", STD 81, RFC 7679, DOI 10.17487/RFC7679, January 3130 2016, . 3132 [RFC7680] Almes, G., Kalidindi, S., Zekauskas, M., and A. Morton, 3133 Ed., "A One-Way Loss Metric for IP Performance Metrics 3134 (IPPM)", STD 82, RFC 7680, DOI 10.17487/RFC7680, January 3135 2016, . 3137 [RFC7752] Gredler, H., Ed., Medved, J., Previdi, S., Farrel, A., and 3138 S. Ray, "North-Bound Distribution of Link-State and 3139 Traffic Engineering (TE) Information Using BGP", RFC 7752, 3140 DOI 10.17487/RFC7752, March 2016, 3141 . 3143 [RFC7923] Voit, E., Clemm, A., and A. Gonzalez Prieto, "Requirements 3144 for Subscription to YANG Datastores", RFC 7923, 3145 DOI 10.17487/RFC7923, June 2016, 3146 . 3148 [RFC7950] Bjorklund, M., Ed., "The YANG 1.1 Data Modeling Language", 3149 RFC 7950, DOI 10.17487/RFC7950, August 2016, 3150 . 3152 [RFC8040] Bierman, A., Bjorklund, M., and K. Watsen, "RESTCONF 3153 Protocol", RFC 8040, DOI 10.17487/RFC8040, January 2017, 3154 . 3156 [RFC8051] Zhang, X., Ed. and I. Minei, Ed., "Applicability of a 3157 Stateful Path Computation Element (PCE)", RFC 8051, 3158 DOI 10.17487/RFC8051, January 2017, 3159 . 3161 [RFC8231] Crabbe, E., Minei, I., Medved, J., and R. Varga, "Path 3162 Computation Element Communication Protocol (PCEP) 3163 Extensions for Stateful PCE", RFC 8231, 3164 DOI 10.17487/RFC8231, September 2017, 3165 . 
[RFC8281] Crabbe, E., Minei, I., Sivabalan, S., and R. Varga, "Path Computation Element Communication Protocol (PCEP) Extensions for PCE-Initiated LSP Setup in a Stateful PCE Model", RFC 8281, DOI 10.17487/RFC8281, December 2017.

[RFC8283] Farrel, A., Ed., Zhao, Q., Ed., Li, Z., and C. Zhou, "An Architecture for Use of PCE and the PCE Communication Protocol (PCEP) in a Network with Central Control", RFC 8283, DOI 10.17487/RFC8283, December 2017.

[RFC8402] Filsfils, C., Ed., Previdi, S., Ed., Ginsberg, L., Decraene, B., Litkowski, S., and R. Shakir, "Segment Routing Architecture", RFC 8402, DOI 10.17487/RFC8402, July 2018.

[RFC8453] Ceccarelli, D., Ed. and Y. Lee, Ed., "Framework for Abstraction and Control of TE Networks (ACTN)", RFC 8453, DOI 10.17487/RFC8453, August 2018.

[RFC8570] Ginsberg, L., Ed., Previdi, S., Ed., Giacalone, S., Ward, D., Drake, J., and Q. Wu, "IS-IS Traffic Engineering (TE) Metric Extensions", RFC 8570, DOI 10.17487/RFC8570, March 2019.

[RFC8571] Ginsberg, L., Ed., Previdi, S., Wu, Q., Tantsura, J., and C. Filsfils, "BGP - Link State (BGP-LS) Advertisement of IGP Traffic Engineering Performance Metric Extensions", RFC 8571, DOI 10.17487/RFC8571, March 2019.

[RFC8661] Bashandy, A., Ed., Filsfils, C., Ed., Previdi, S., Decraene, B., and S. Litkowski, "Segment Routing MPLS Interworking with LDP", RFC 8661, DOI 10.17487/RFC8661, December 2019.

[RFC8795] Liu, X., Bryskin, I., Beeram, V., Saad, T., Shah, H., and O. Gonzalez de Dios, "YANG Data Model for Traffic Engineering (TE) Topologies", RFC 8795, DOI 10.17487/RFC8795, August 2020.

[RR94] Rodrigues, M. and K. Ramakrishnan, "Optimal Routing in Shortest Path Networks", Proceedings ITS'94, Rio de Janeiro, Brazil, 1994.

[SLDC98] Suter, B., Lakshman, T., Stiliadis, D., and A. Choudhury, "Design Considerations for Supporting TCP with Per-flow Queueing", Proceedings INFOCOM'98, p. 299-306, 1998.

[WANG] Wang, Y., Wang, Z., and L. Zhang, "Internet traffic engineering without full mesh overlaying", Proceedings INFOCOM'2001, April 2001.

[XIAO] Xiao, X., Hannan, A., Bailey, B., and L. Ni, "Traffic Engineering with MPLS in the Internet", Article IEEE Network Magazine, March 2000.

[YARE95] Yang, C. and A. Reddy, "A Taxonomy for Congestion Control Algorithms in Packet Switching Networks", Article IEEE Network Magazine, p. 34-45, 1995.

Appendix A.  Historic Overview

A.1.  Traffic Engineering in Classical Telephone Networks

This subsection presents a brief overview of traffic engineering in telephone networks, which often relates to the way user traffic is steered from an originating node to the terminating node.  A detailed description of the various routing strategies applied in telephone networks is included in the book by G. Ash [ASH2].

The early telephone network relied on static hierarchical routing, whereby routing patterns remained fixed independent of the state of the network or time of day.  The hierarchy was intended to accommodate overflow traffic, improve network reliability via alternate routes, and prevent call looping by employing strict hierarchical rules.
The network was typically over-provisioned since a given fixed route had to be dimensioned so that it could carry user traffic during a busy hour of any busy day.  Hierarchical routing in the telephony network was found to be too rigid upon the advent of digital switches and stored program control which were able to manage more complicated traffic engineering rules.

Dynamic routing was introduced to alleviate the routing inflexibility of static hierarchical routing so that the network would operate more efficiently.  This resulted in significant economic gains [HUSS87].  Dynamic routing typically reduces the overall loss probability by 10 to 20 percent (compared to static hierarchical routing).  Dynamic routing can also improve network resilience by recalculating routes on a per-call basis and periodically updating routes.

There are three main types of dynamic routing in the telephone network.  They are time-dependent routing, state-dependent routing (SDR), and event-dependent routing (EDR).

In time-dependent routing, regular variations in traffic loads (such as time of day or day of week) are exploited in pre-planned routing tables.  In state-dependent routing, routing tables are updated online according to the current state of the network (e.g., traffic demand, utilization, etc.).  In event-dependent routing, routing changes are triggered by events (such as call setups encountering congested or blocked links) whereupon new paths are searched out using learning models.  EDR methods are real-time adaptive, but they do not require global state information as does SDR.  Examples of EDR schemes include the dynamic alternate routing (DAR) from BT, the state-and-time dependent routing (STR) from NTT, and the success-to-the-top (STT) routing from AT&T.

Dynamic non-hierarchical routing (DNHR) is an example of dynamic routing that was introduced in the AT&T toll network in the 1980s to respond to time-dependent information such as regular load variations as a function of time.  Time-dependent information in terms of load may be divided into three timescales: hourly, weekly, and yearly.  Correspondingly, three algorithms are defined to pre-plan the routing tables.  The network design algorithm operates over a year-long interval, while the demand servicing algorithm operates on a weekly basis to fine-tune link sizes and routing tables and to correct errors in the yearly forecast.  At the smallest timescale, the routing algorithm is used to make limited adjustments based on daily traffic variations.  Network design and demand servicing are computed using offline calculations.  Typically, the calculations require extensive searches on possible routes.  On the other hand, routing may need online calculations to handle crankback.  DNHR adopts a "two-link" approach whereby a path can consist of two links at most.  The routing algorithm presents an ordered list of route choices between an originating switch and a terminating switch.  If a call overflows, a via switch (a tandem exchange between the originating switch and the terminating switch) would send a crankback signal to the originating switch.  This switch would then select the next route, and so on, until there are no alternative routes available, in which case the call is blocked.

A.2.
Evolution of Traffic Engineering in Packet Networks

This subsection reviews related prior work that was intended to improve the performance of data networks.  Indeed, optimization of the performance of data networks started in the early days of the ARPANET.  Other early commercial networks such as SNA also recognized the importance of performance optimization and service differentiation.

In terms of traffic management, the Internet has been a best effort service environment until recently.  In particular, very limited traffic management capabilities existed in IP networks to provide differentiated queue management and scheduling services to packets belonging to different classes.

In terms of routing control, the Internet has employed distributed protocols for intra-domain routing.  These protocols are highly scalable and resilient.  However, they are based on simple algorithms for path selection which have very limited functionality to allow flexible control of the path selection process.

In the following subsections, the evolution of practical traffic engineering mechanisms in IP networks and their predecessors is reviewed.

A.2.1.  Adaptive Routing in the ARPANET

The early ARPANET recognized the importance of adaptive routing where routing decisions were based on the current state of the network [MCQ80].  Early minimum delay routing approaches forwarded each packet to its destination along a path for which the total estimated transit time was the smallest.  Each node maintained a table of network delays, representing the estimated delay that a packet would experience along a given path toward its destination.  The minimum delay table was periodically transmitted by a node to its neighbors.  The shortest path, in terms of hop count, was also propagated to give the connectivity information.

One drawback to this approach is that dynamic link metrics tend to create "traffic magnets", causing congestion to be shifted from one location of a network to another location, resulting in oscillation and network instability.

A.2.2.  Dynamic Routing in the Internet

The Internet evolved from the ARPANET and adopted dynamic routing algorithms with distributed control to determine the paths that packets should take en route to their destinations.  The routing algorithms are adaptations of shortest path algorithms where costs are based on link metrics.  The link metric can be based on static or dynamic quantities.  The link metric based on static quantities may be assigned administratively according to local criteria.  The link metric based on dynamic quantities may be a function of a network congestion measure such as delay or packet loss.

It was apparent early that static link metric assignment was inadequate because it can easily lead to unfavorable scenarios in which some links become congested while others remain lightly loaded.  One of the many reasons for the inadequacy of static link metrics is that link metric assignment was often done without considering the traffic matrix in the network.  Also, the routing protocols did not take traffic attributes and capacity constraints into account when making routing decisions.  This resulted in traffic concentration being localized in subsets of the network infrastructure and potentially causing congestion.
Even if link metrics are assigned in accordance with the traffic matrix, unbalanced loads in the network can still occur due to a number of factors, including:

o  Resources may not be deployed in the optimal locations from a routing perspective.

o  Forecasting errors in traffic volume and/or traffic distribution.

o  Dynamics in the traffic matrix due to the temporal nature of traffic patterns, BGP policy changes by peers, etc.

The inadequacy of the legacy Internet interior gateway routing system is one of the factors motivating the interest in path-oriented technology with explicit routing and constraint-based routing capability such as MPLS.

A.2.3.  ToS Routing

Type-of-Service (ToS) routing involves different routes going to the same destination with selection dependent upon the ToS field of an IP packet [RFC2474].  The ToS classes may be classified as low delay and high throughput.  Each link is associated with multiple link costs and each link cost is used to compute routes for a particular ToS.  A separate shortest path tree is computed for each ToS.  The shortest path algorithm must be run for each ToS, resulting in very expensive computation.  Classical ToS-based routing is now outdated as the IP header field has been replaced by a Diffserv field.  Effective traffic engineering is difficult to perform in classical ToS-based routing because each class still relies exclusively on shortest path routing, which results in localization of traffic concentration within the network.

A.2.4.  Equal Cost Multi-Path

Equal Cost Multi-Path (ECMP) is another technique that attempts to address the deficiency in the Shortest Path First (SPF) interior gateway routing systems [RFC2328].  In the classical SPF algorithm, if two or more shortest paths exist to a given destination, the algorithm will choose one of them.  The algorithm is modified slightly in ECMP so that if two or more equal-cost shortest paths exist between two nodes, the traffic between the nodes is distributed among the multiple equal-cost paths.  Traffic distribution across the equal-cost paths is usually performed in one of two ways: (1) packet-based in a round-robin fashion, or (2) flow-based using hashing on source and destination IP addresses and possibly other fields of the IP header.  The first approach can easily cause out-of-order packets while the second approach is dependent upon the number and distribution of flows.  Flow-based load sharing may be unpredictable in an enterprise network where the number of flows is relatively small and less heterogeneous (for example, hashing may not be uniform), but it is generally effective in core public networks where the number of flows is large and heterogeneous.

In ECMP, link costs are static and bandwidth constraints are not considered, so ECMP attempts to distribute the traffic as equally as possible among the equal-cost paths independent of the congestion status of each path.  As a result, given two equal-cost paths, it is possible that one of the paths will be more congested than the other.  Another drawback of ECMP is that load sharing cannot be achieved on multiple paths which have non-identical costs.

A.2.5.
A.2.5. Nimrod

   Nimrod was a routing system developed to provide heterogeneous,
   service-specific routing in the Internet, while taking multiple
   constraints into account [RFC1992].  Essentially, Nimrod was a
   link-state routing protocol supporting path-oriented packet
   forwarding.  It used the concept of maps to represent network
   connectivity and services at multiple levels of abstraction, and it
   provided mechanisms to restrict the distribution of routing
   information.

   Even though Nimrod did not enjoy deployment in the public Internet,
   a number of key concepts incorporated into the Nimrod architecture,
   such as explicit routing, which allows selection of paths at
   originating nodes, are beginning to find application in some recent
   constraint-based routing initiatives.

A.3. Development of Internet Traffic Engineering

A.3.1. Overlay Model

   In the overlay model, a virtual-circuit network, such as SONET/SDH,
   OTN, or WDM, provides virtual-circuit connectivity between routers
   that are located at the edges of a virtual-circuit cloud.  In this
   model, two routers that are connected through a virtual circuit see
   a direct adjacency between themselves, independent of the physical
   route taken by the virtual circuit through the underlying network.
   Thus, the overlay model essentially decouples the logical topology
   that the routers see from the physical topology that the underlying
   network manages.  The overlay model enables a network administrator
   or an automaton to employ traffic engineering concepts to perform
   path optimization by re-configuring or rearranging the virtual
   circuits so that a virtual circuit on a congested or sub-optimal
   physical link can be re-routed to a less congested or more optimal
   one.  In the overlay model, traffic engineering is also employed to
   establish relationships between the traffic management parameters
   of the virtual-circuit technology (e.g., PCR, SCR, and MBS for ATM)
   and the actual traffic that traverses each circuit.  These
   relationships can be established based upon known or projected
   traffic profiles and other factors.

Appendix B. Overview of Traffic Engineering Related Work in Other SDOs

B.1. Overview of ITU Activities Related to Traffic Engineering

   This section provides an overview of prior work within the ITU-T
   pertaining to traffic engineering in traditional telecommunications
   networks.

   ITU-T Recommendations E.600 [ITU-E600], E.701 [ITU-E701], and E.801
   [ITU-E801] address traffic engineering issues in traditional
   telecommunications networks.  Recommendation E.600 provides a
   vocabulary for describing traffic engineering concepts, while E.701
   defines reference connections, Grade of Service (GoS), and traffic
   parameters for ISDN.  Recommendation E.701 uses the concept of a
   reference connection to identify representative cases of different
   types of connections without describing the specifics of their
   actual realizations by different physical means.  As defined in
   Recommendation E.600, "a connection is an association of resources
   providing means for communication between two or more devices in,
   or attached to, a telecommunication network."
   Also, E.600 defines a resource as "any set of physically or
   conceptually identifiable entities within a telecommunication
   network, the use of which can be unambiguously determined"
   [ITU-E600].  There can be different types of connections, as the
   number and types of resources in a connection may vary.

   Typically, different network segments are involved in the path of a
   connection.  For example, a connection may be local, national, or
   international.  The purpose of reference connections is to clarify
   and specify traffic performance issues at various interfaces
   between different network domains.  Each domain may consist of one
   or more service provider networks.

   Reference connections provide a basis for defining GoS parameters
   related to traffic engineering within the ITU-T framework.  As
   defined in E.600, "GoS refers to a number of traffic engineering
   variables which are used to provide a measure of the adequacy of a
   group of resources under specified conditions."  These GoS
   variables may be probability of loss, dial tone delay, etc.  They
   are essential for internal network design and operation as well as
   for component performance specification.

   GoS is different from quality of service (QoS) in the ITU
   framework.  QoS is the performance perceivable by a
   telecommunication service user and expresses the user's degree of
   satisfaction with the service.  QoS parameters focus on performance
   aspects observable at the service access points and network
   interfaces, rather than on their causes within the network.  GoS,
   on the other hand, is a set of network-oriented measures that
   characterize the adequacy of a group of resources under specified
   conditions.  For a network to be effective in serving its users,
   the values of both GoS and QoS parameters must be related, with GoS
   parameters typically making a major contribution to the QoS.

   Recommendation E.600 stipulates that a set of GoS parameters must
   be selected and defined on an end-to-end basis for each major
   service category provided by a network to assist the network
   provider with improving the efficiency and effectiveness of the
   network.  Based on a selected set of reference connections,
   suitable target values are assigned to the selected GoS parameters
   under normal and high-load conditions.  These end-to-end GoS target
   values are then apportioned to individual resource components of
   the reference connections for dimensioning purposes.

Appendix C. Summary of Changes Since RFC 3272

   This section is a placeholder.  It is expected that, once work on
   this document is nearly complete, this section will be updated to
   provide an overview of the structural and substantive changes from
   RFC 3272.

Author's Address

   Adrian Farrel (editor)
   Old Dog Consulting

   Email: adrian@olddog.co.uk