RTGWG                                                            S. Ning
Internet-Draft                                       Tata Communications
Intended status: Informational                                D. McDysan
Expires: January 16, 2014                                        Verizon
                                                              E. Osborne
                                                                   Cisco
                                                                 L. Yong
                                                              Huawei USA
                                                           C. Villamizar
                                                  Outer Cape Cod Network
                                                              Consulting
                                                           July 15, 2013


                  Advanced Multipath Framework in MPLS
                    draft-ietf-rtgwg-cl-framework-04

Abstract

   This document specifies a framework for support of Advanced Multipath
   in MPLS networks.  As defined in this framework, an Advanced
   Multipath consists of a group of homogeneous or non-homogeneous links
   that have the same forwarding adjacency (FA) and can be considered as
   a single TE link or an IP link when advertised into IGP routing.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on January 16, 2014.

Copyright Notice

   Copyright (c) 2013 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  Background
     1.2.  Architecture Summary
     1.3.  Conventions used in this document
     1.4.  Terminology
     1.5.  Document Issues
   2.  Advanced Multipath Key Characteristics
     2.1.  Flow Identification
       2.1.1.  Flow Identification Granularity
       2.1.2.  Flow Identification Summary
       2.1.3.  Flow Identification Using Entropy Label
     2.2.  Advanced Multipath in Control Plane
     2.3.  Advanced Multipath in Data Plane
   3.  Architecture Tradeoffs
     3.1.  Scalability Motivations
     3.2.  Reducing Routing Information and Exchange
     3.3.  Reducing Signaling Load
       3.3.1.  Reducing Signaling Load using LDP MPTP
       3.3.2.  Reducing Signaling Load using Hierarchy
       3.3.3.  Using Both LDP MPTP and RSVP-TE Hierarchy
     3.4.  Reducing Forwarding State
     3.5.  Avoiding Route Oscillation
   4.  New Challenges
     4.1.  Control Plane Challenges
       4.1.1.  Delay and Jitter Sensitive Routing
       4.1.2.  Local Control of Traffic Distribution
       4.1.3.  Path Symmetry Requirements
       4.1.4.  Requirements for Contained LSP
       4.1.5.  Retaining Backwards Compatibility
     4.2.  Data Plane Challenges
       4.2.1.  Very Large LSP
       4.2.2.  Very Large Microflows
       4.2.3.  Traffic Ordering Constraints
       4.2.4.  Accounting for IP and LDP Traffic
       4.2.5.  IP and LDP Limitations
   5.  Existing Mechanisms
     5.1.  Link Bundling
     5.2.  Classic Multipath
   6.  Mechanisms Proposed in Other Documents
     6.1.  Loss and Delay Measurement
     6.2.  Link Bundle Extensions
     6.3.  Pseudowire Flow and MPLS Entropy Labels
     6.4.  Multipath Extensions
   7.  Required Protocol Extensions and Mechanisms
     7.1.  Brief Review of Requirements
     7.2.  Proposed Document Coverage
       7.2.1.  Component Link Grouping
       7.2.2.  Delay and Jitter Extensions
       7.2.3.  Path Selection and Admission Control
       7.2.4.  Dynamic Multipath Balance
       7.2.5.  Frequency of Load Balance
       7.2.6.  Inter-Layer Communication
       7.2.7.  Packet Ordering Requirements
       7.2.8.  Minimally Disruption Load Balance
       7.2.9.  Path Symmetry
       7.2.10. Performance, Scalability, and Stability
       7.2.11. IP and LDP Traffic
       7.2.12. LDP Extensions
       7.2.13. Pseudowire Extensions
       7.2.14. Multi-Domain Advanced Multipath
     7.3.  Framework Requirement Coverage by Protocol
       7.3.1.  OSPF-TE and ISIS-TE Protocol Extensions
       7.3.2.  PW Protocol Extensions
       7.3.3.  LDP Protocol Extensions
       7.3.4.  RSVP-TE Protocol Extensions
       7.3.5.  RSVP-TE Path Selection Changes
       7.3.6.  RSVP-TE Admission Control and Preemption
       7.3.7.  Flow Identification and Traffic Balance
   8.  IANA Considerations
   9.  Security Considerations
   10. Acknowledgments
   11. References
     11.1.  Normative References
     11.2.  Informative References
   Authors' Addresses

1.  Introduction

   Advanced Multipath functional requirements are specified in
   [I-D.ietf-rtgwg-cl-requirement].  Advanced Multipath use cases are
   described in [I-D.ietf-rtgwg-cl-use-cases].  This document specifies
   a framework to meet these requirements.

   This document describes an Advanced Multipath framework in the
   context of MPLS networks using an IGP-TE and RSVP-TE MPLS control
   plane with GMPLS extensions [RFC3209] [RFC3630] [RFC3945] [RFC5305].

   Specific protocol solutions are outside the scope of this document;
   however, a framework for the extension of existing protocols is
   provided.  Backwards compatibility is best achieved by extending
   existing protocols where practical rather than inventing new
   protocols.
   The focus is on examining where existing protocol
   mechanisms fall short with respect to [I-D.ietf-rtgwg-cl-requirement]
   and on the types of extensions that will be required to accommodate
   functionality that is called for in [I-D.ietf-rtgwg-cl-requirement].

1.1.  Background

   Classic multipath, including Ethernet Link Aggregation, has been
   widely used in today's MPLS networks [RFC4385][RFC4928].  Classic
   multipath using non-Ethernet links is often advertised using MPLS
   Link Bundling.  A link bundle [RFC4201] bundles a group of
   homogeneous links as a TE link to make IGP-TE information exchange
   and RSVP-TE signaling more scalable.  An Advanced Multipath allows
   bundling non-homogeneous links together as a single logical link.

   An Advanced Multipath is a single logical link in an MPLS network
   that contains multiple parallel component links between two MPLS LSR.
   Unlike a link bundle [RFC4201], the component links in an Advanced
   Multipath can have different properties such as cost, capacity,
   delay, or jitter.

1.2.  Architecture Summary

   Networks aggregate information, both in the control plane and in the
   data plane, as a means to achieve scalability.  A tradeoff exists
   between the need for scalability and the need to identify differing
   path and link characteristics and differing requirements among flows
   contained within further aggregated traffic flows.  These tradeoffs
   are discussed in detail in Section 3.

   Some aspects of Advanced Multipath requirements present challenges
   for which multiple solutions may exist.  In Section 4 various
   challenges and potential approaches are discussed.

   A subset of the functionality called for in
   [I-D.ietf-rtgwg-cl-requirement] is available through MPLS Link
   Bundling [RFC4201].  Link bundling and other existing standards
   applicable to Advanced Multipath are covered in Section 5.

   The most straightforward means of supporting Advanced Multipath
   requirements is to extend MPLS protocols and protocol semantics and
   in particular to extend link bundling.  Extensions already proposed
   in other documents that are applicable to Advanced Multipath are
   discussed in Section 6.

   A goal of most new protocol work within IETF is to reuse existing
   protocol encapsulations and mechanisms where they meet requirements
   and extend existing mechanisms.  This approach minimizes additional
   complexity while meeting requirements and tends to preserve backwards
   compatibility to the extent it is practical to do so.  These goals
   are considered in proposing a framework for further protocol
   extensions and mechanisms in Section 7.

1.3.  Conventions used in this document

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

1.4.  Terminology

   Terminology defined in [I-D.ietf-rtgwg-cl-requirement] is used in
   this document.  The additional terms defined in
   [I-D.ietf-rtgwg-cl-use-cases] are also used.

   The abbreviation IGP-TE is used as a shorthand indicating either
   OSPF-TE [RFC3630] or ISIS-TE [RFC5305].

1.5.  Document Issues

   This subsection exists solely for the purpose of focusing the RTGWG
   meeting and mailing list discussions on areas within this document
   that need attention in order for the document to achieve the level of
   quality necessary to advance the document through the IETF process.
   This subsection will be removed before working group last call.

   The following issues need to be resolved.

   1.  The feasibility of symmetric paths for all flows is questionable.
       The only case where this is practical is where LSP are smaller
       than component links and where classic link bundling (not using
       the all-ones component) is used.  Perhaps the emphasis on this
       (mis)feature should be reduced in the requirements document.  See
       Section 4.1.3.

   2.  There is a tradeoff between supporting delay optimized routing
       and avoiding oscillation.  This may be sufficiently covered, but
       a careful review by others and comments would be beneficial.

   3.  Any measurement of jitter (delay variation) that is used in route
       decisions is likely to cause oscillation.  Trying to optimize a
       path to reduce jitter may be a fool's errand.  How do we say this
       in the draft, or does the existing text cover it adequately?

   4.  RTGWG needs to consider the possibility of using multi-topology
       IGP extensions in IP and LDP routing where the topologies reflect
       differing requirements (see Section 4.2.5).  This idea is similar
       to TOS routing, which has been discussed for decades but has
       never been deployed.  One possible outcome of discussion would be
       to declare TOS routing out of scope in the requirements document.

   5.  The following referenced drafts have expired:

       A.  [I-D.ospf-cc-stlv]

       B.  [I-D.villamizar-mpls-multipath-extn]

       A replacement for [I-D.ospf-cc-stlv] is expected to be submitted.
       [I-D.villamizar-mpls-multipath-extn] is expected to emerge in a
       simplified form, removing extensions for which existing
       workarounds are considered adequate based on feedback at a prior
       IETF.

   6.  Clarification of what we intend to do with Multi-Domain Advanced
       Multipath is needed in Section 7.2.14.

   7.  The following topics in the requirements document are not
       addressed.  Since they are explicitly mentioned in the
       requirements document, some mention of how they are supported is
       needed in this document.

       A.  Migration (incremental deployment) may not be adequately
           covered in Section 4.1.5.  It might also be necessary to say
           more here on performance, scalability, and stability as they
           relate to migration.  Comments on this from co-authors or
           the WG?

       B.  We may need a performance section in this document to
           specifically address DR#6 (fast convergence) and DR#7 (fast
           worst case failure convergence).  We do already have
           scalability discussion and make a recommendation for a
           separate document.  At the very least the performance section
           would have to say "no worse than before, except where there
           was no alternative to make it very slightly worse" (in a bit
           more detail than that).  It might also be helpful to better
           define the nature of the performance criteria implied by DR#6
           and DR#7.

   The above list has been in this document for the better part of a
   year with very little discussion (or none) of the above issues on the
   RTGWG mailing list.

2.  Advanced Multipath Key Characteristics

   [I-D.ietf-rtgwg-cl-requirement] defines external behavior of Advanced
   Multipath.  The overall framework approach involves extending
   existing protocols in a backwards compatible manner and reusing
   ongoing work elsewhere in IETF where applicable, defining new
   protocols or semantics only where necessary.  Given the requirements,
   and this approach of extending MPLS, Advanced Multipath key
   characteristics can be described in greater detail than is possible
   given the requirements alone.

2.1.  Flow Identification

   Traffic mapping to component links is a data plane operation.
   Control over how the mapping is done may be directly dictated or
   constrained by the control plane or by the management plane.  When
   unconstrained by the control plane or management plane, distribution
   of traffic is entirely a local matter.
   Regardless of constraints or lack of constraints, the traffic
   distribution is required to keep packets belonging to individual
   flows in sequence and meet QoS criteria specified per LSP by either
   signaling or management [RFC2475] [RFC3260].

   Key objectives of the traffic distribution are to not overload any
   component link, and to be able to perform local recovery when a
   subset of component links fails.

   The network operator may have other objectives such as placing a
   bidirectional flow or LSP on the same component link in both
   directions, bounding delay and/or jitter, Advanced Multipath energy
   saving, etc.  These new requirements are described in
   [I-D.ietf-rtgwg-cl-requirement].

   Examples of means to identify a flow may in principle include:

   1.  an LSP identified by an MPLS label,

   2.  a pseudowire (PW) [RFC3985] identified by an MPLS PW label,

   3.  a flow or group of flows within a pseudowire (PW) [RFC6391]
       identified by an MPLS flow label,

   4.  a flow or flow group in an LSP [RFC6790] identified by an MPLS
       entropy label,

   5.  all traffic between a pair of IP hosts, identified by an IP
       source and destination pair,

   6.  a specific connection between a pair of IP hosts, identified by
       an IP source and destination pair, protocol, and protocol port
       pair,

   7.  a layer-2 conversation within a pseudowire (PW), where the
       identification is PW payload type specific, such as Ethernet MAC
       addresses and VLAN tags within an Ethernet PW [RFC4448].  This is
       feasible but not practical (see below).

   Although in principle a layer-2 conversation within a pseudowire
   (PW) may be identified by PW payload type specific information, this
   is impractical at LSP midpoints when PW are carried.  The PW ingress
   may provide equivalent information in a PW flow label [RFC6391].
   Therefore, in practice, item #7 above is covered by
   [RFC6391] and may be dropped from the list.

2.1.1.  Flow Identification Granularity

   An LSR must at least be capable of identifying flows based on MPLS
   labels.  Most MPLS LSP do not require that traffic carried by the LSP
   is carried in order.  MPLS-TP is a recent exception.  If it is
   assumed that no LSP require strict packet ordering of the LSP itself
   (only of flows within the LSP), then the entire label stack can be
   used as flow identification.  If some LSP may require strict packet
   ordering but those LSP cannot be distinguished from others, then only
   the top label can be used as a flow identifier.  If only the top
   label is used (for example, as specified by [RFC4201] when the "all-
   ones" component described in [RFC4201] is not used), then there may
   not be adequate flow granularity to accomplish well balanced traffic
   distribution and it will not be possible to carry LSP that are larger
   than any individual component link.

   The number of flows can be extremely large.  This may be the case
   when the entire label stack is used and is always the case when IP
   addresses are used in provider networks carrying Internet traffic.

   Current practices for native IP load balancing at the time of writing
   were documented in [RFC2991] and [RFC2992].  These practices, as
   described, make use of IP addresses.

   The common practices described in [RFC2991] and [RFC2992] were
   extended to include the MPLS label stack and the common practice of
   looking at IP addresses within the MPLS payload.  These extended
   practices require that pseudowires use a PWE3 Control Word and are
   described in [RFC4385] and [RFC4928].  Additional detail on current
   multipath practices can be found in the appendices of
   [I-D.ietf-rtgwg-cl-use-cases].

   Using only the top label supports too coarse a traffic balance.
   Prior to MPLS Entropy Label [RFC6790], using the full label stack was
   also too coarse.  Using the full label stack and IP addresses as flow
   identification provides a sufficiently fine traffic balance, but can
   identify so many distinct flows that a technique for grouping flows,
   such as hashing on the flow identification criteria, becomes
   essential both to reduce the stored state and to scale.  Other means
   of grouping flows may be possible.

2.1.2.  Flow Identification Summary

   In summary:

   1.  Load balancing using only the MPLS label stack provides too
       coarse a granularity of load balance.

   2.  Tracking every flow is not scalable due to the extremely large
       number of flows in provider networks.

   3.  Existing techniques, IP source and destination hash in
       particular, have proven in over two decades of experience to be
       an excellent way of identifying groups of flows.

   4.  If a better way to identify groups of flows is discovered, then
       that method can be used.

   5.  IP address hashing is not required, but use of this technique is
       strongly encouraged given the technique's long history of
       successful deployment.

2.1.3.  Flow Identification Using Entropy Label

   MPLS Entropy Label [RFC6790] provides a means of making use of the
   entropy from information that would require deeper packet inspection,
   such as inspection of IP addresses, and putting that entropy in the
   form of a hashed value into the label stack.  Midpoint LSR that
   understand the Entropy Label Indicator can make use of only label
   stack information but still obtain a fine load balance granularity.

2.2.  Advanced Multipath in Control Plane

   An Advanced Multipath is advertised as a single logical interface
   between two connected routers, which forms a forwarding adjacency
   (FA) between the routers.
   The FA is advertised as a TE-link in a link
   state IGP, using either OSPF-TE or ISIS-TE.  The IGP-TE advertised
   interface parameters for the Advanced Multipath can be preconfigured
   by the network operator or be derived from its component links.
   Advanced Multipath advertisement requirements are specified in
   [I-D.ietf-rtgwg-cl-requirement].

   In IGP-TE, an Advanced Multipath is advertised as a single TE link
   between two connected routers.  This is similar to a link bundle
   [RFC4201].  Link bundle applies to a set of homogeneous component
   links.  Advanced Multipath allows homogeneous and non-homogeneous
   component links.  Due to the similarity, and for backwards
   compatibility, extending link bundling is viewed as both simple and
   as the best approach.

   In order for a route computation engine to calculate a proper path
   for an LSP, it is necessary for Advanced Multipath to advertise the
   summarized available bandwidth as well as the maximum bandwidth that
   can be made available for a single flow (or a single LSP where no
   finer flow identification is available).  If an Advanced Multipath
   contains some non-homogeneous component links, the Advanced Multipath
   should also advertise the summarized bandwidth and the maximum
   bandwidth for a single flow for each homogeneous component link
   group.

   Both LDP [RFC5036] and RSVP-TE [RFC3209] can be used to signal an LSP
   over an Advanced Multipath.  LDP cannot be extended to support
   traffic engineering capabilities [RFC3468].

   When an LSP is signaled using RSVP-TE, the LSP MUST be placed on the
   component link that meets the LSP criteria indicated in the signaling
   message.

   When an LSP is signaled using LDP, the LSP MUST be placed on the
   component link that meets the LSP criteria, if such a component link
   is available.  LDP does not support traffic engineering capabilities,
   imposing restrictions on LDP use of Advanced Multipath.
   See Section 4.2.5 for further details.

   If the Advanced Multipath solution is based on extensions to IGP-TE
   and RSVP-TE, then in order to meet requirements defined in
   [I-D.ietf-rtgwg-cl-requirement], the following derived requirements
   MUST be met.

   1.  An Advanced Multipath MAY contain non-homogeneous component
       links.  The route computing engine MAY select one group of
       component links for an LSP.  The route computing engine MUST
       accommodate service objectives for a given LSP when selecting a
       group of component links for an LSP.

   2.  The routing protocol MUST make a grouping of component links
       available in the TE-LSDB, such that within each group all of the
       component links have similar characteristics (the component links
       are homogeneous within a group).

   3.  The route computation used in RSVP-TE MUST be extended to include
       only the capacity of groups within an Advanced Multipath which
       meet LSP criteria.

   4.  The signaling protocol MUST be able to indicate either the
       criteria, or which groups may be used.

   5.  An Advanced Multipath MUST place each LSP on a component link or
       group which meets or exceeds the LSP criteria.

   Advanced Multipath capacity is aggregated capacity.  LSP capacity MAY
   be larger than individual component link capacity.  For any
   aggregated LSP, a bound on the largest microflow that could be
   carried can be determined, and this constraint can be handled as
   follows.

   1.  If no information is available through signaling, management
       plane, or configuration, the largest microflow is bounded by one
       of the following:

       A.  the largest single LSP if most traffic is RSVP-TE signaled
           and further aggregated,

       B.  the largest pseudowire if most traffic is carrying pseudowire
           payloads that are aggregated within RSVP-TE LSP,

       C.  or the largest interface or component link capacity carrying
           IP or LDP if a large amount of IP or LDP traffic is contained
           within the aggregate.

       If a very large amount of traffic being aggregated is IP or LDP,
       then the largest microflow is bounded by the largest component
       link on which IP traffic can arrive.  For example, if an LSR is
       acting as an LER and IP and LDP traffic is arriving on 10 Gb/s
       edge interfaces, then no microflow larger than 10 Gb/s will be
       present on the RSVP-TE LSP that aggregate traffic across the
       core, even if the core interfaces are 100 Gb/s interfaces.

   2.  The prior conditions provide a bound on the largest microflow
       when no signaling extensions indicate a bound.  If an LSP is
       aggregating smaller LSP for which the largest expected microflow
       carried by the smaller LSP is signaled, then the largest
       microflow expected in the containing LSP (the aggregate) is the
       maximum of the largest expected microflow for any contained LSP.
       For example, RSVP-TE LSP may be large but aggregate traffic for
       which the source or sink are all 1 Gb/s or smaller interfaces
       (such as in mobile applications in which cell site backhauls are
       no larger than 1 Gb/s).  If this information is carried in the
       LSP originated at the cell sites, then further aggregates across
       a core may make use of this information.

   3.  The IGP must provide the bound on the largest microflow that an
       Advanced Multipath can accommodate, which is the maximum capacity
       on a component link that can be made available by moving other
       traffic.  This information is needed by the ingress LER for path
       determination.

   4.  A means is needed to signal an LSP whose capacity is larger than
       individual component link capacity
       [I-D.ietf-rtgwg-cl-requirement] and also to signal the largest
       microflow expected to be contained in the LSP.
       If a bound on the largest microflow is not signaled,
       there is no means to determine if an LSP which is larger than any
       component link can be subdivided into flows and therefore should
       be accepted by admission control.

   When a bidirectional LSP request is signaled over an Advanced
   Multipath, if the request indicates that the LSP must be placed on
   the same component link, the routers of the Advanced Multipath MUST
   place the LSP traffic in both directions on the same component link.
   This is particularly challenging for aggregated capacity which makes
   use of the label stack for traffic distribution.  The two
   requirements (symmetric placement and capacity larger than a
   component link) are mutually exclusive for any one LSP.  No one LSP
   may be both larger than any individual component link and require
   symmetrical paths for every flow.  Both requirements can be
   accommodated by the same Advanced Multipath for different LSP, with
   any one LSP requiring no more than one of these two features.

   Individual component links may fail independently.  Upon component
   link failure, an Advanced Multipath MUST support a minimally
   disruptive local repair, preempting any LSP which can no longer be
   supported.  Available capacity in other component links MUST be used
   to carry impacted traffic.  The available bandwidth after failure
   MUST be advertised immediately to avoid looped crankback.

   When an Advanced Multipath is not able to transport all flows, it
   preempts some flows based upon holding priority and informs the
   control plane of these preempted flows.  To minimize impact on
   traffic, the Advanced Multipath MUST support soft preemption
   [RFC5712].  The network operator SHOULD enable soft preemption.  This
   action ensures the remaining traffic is transported properly.  FR#10
   requires that the traffic be restored.  FR#12 requires that any
   change be minimally disruptive.
   These two requirements are
   interpreted to include preemption among the types of changes that
   must be minimally disruptive.

2.3.  Advanced Multipath in Data Plane

   The data plane must identify groups of flows.  Flow identification is
   covered in Section 2.1.  Having identified groups of flows, the
   groups must be placed on individual component links.  This step
   following flow group identification is called traffic distribution or
   traffic placement.  The two steps together are known as traffic
   balancing or load balancing.

   Traffic distribution may be determined by or constrained by the
   control plane or management plane.  Traffic distribution may be
   changed due to component link status change, subject to constraints
   imposed by either the management plane or control plane.  The
   distribution function is local to the routers to which an Advanced
   Multipath belongs, and its implementation is not specified here.

   When performing traffic placement, an Advanced Multipath does not
   differentiate between multicast traffic and unicast traffic.

   In order to maintain scalability, existing data plane forwarding
   retains state associated with the top label only.  Using ultimate hop
   popping (UHP, the absence of the more common penultimate hop
   popping, PHP), zero or more labels may be popped and packet and byte
   counters incremented prior to processing what becomes the top label
   after the POP operations are completed.  Flow group identification
   may be a parallel step in the forwarding process.  Data plane
   forwarding makes use of the top label to select an Advanced
   Multipath, a group of components within an Advanced Multipath, or,
   for the case where an LSP is pinned (see [RFC4201]), a specific
   component link.
   For those LSP for which the top label selects only the Advanced Multipath or a group of components within an Advanced Multipath, the load balancing makes use of the set of component links selected based on the top label, and makes use of the flow group identification to select among that group.

   The simplest traffic placement technique uses a modulo operation after computing a hash. This technique has significant disadvantages. The most common traffic placement technique uses a flow group identification as an index into a table. The table provides an indirection. The number of bits of hash is constrained to keep table size small. While this is not the best technique, it is the most common. Better techniques exist but they are outside the scope of this document and some are considered proprietary.

   Requirements to limit frequency of load balancing can be adhered to by keeping track of when a flow group was last moved and imposing a minimum period before that flow group can be moved again. This is straightforward for a table approach. For other approaches it may be less straightforward.

3.  Architecture Tradeoffs

   Scalability and stability are critical considerations in protocol design where protocols may be used in a large network such as today's service provider networks. Advanced Multipath is applicable to networks which are large enough to require that traffic be split over multiple paths. Scalability is a major consideration for networks that reach a capacity large enough to require Advanced Multipath.

   Some of the requirements of Advanced Multipath could potentially have a negative impact on scalability. This section is about architectural tradeoffs, many motivated by the need to maintain scalability and stability, a need which is reflected in [I-D.ietf-rtgwg-cl-requirement], specifically in DR#6 and DR#7.

3.1.  Scalability Motivations

   In the interest of scalability, information is aggregated in situations where information about a large amount of network capacity or a large amount of network demand is adequate to meet requirements. Routing information is aggregated to reduce the amount of information exchange related to routing and to simplify route computation (see Section 3.2).

   In an MPLS network large routing changes can occur when a single fault occurs. For example, a single fault may impact a very large number of LSP traversing a given link. As new LSP are signaled to avoid the fault, resources are consumed elsewhere, and routing protocol announcements must flood the resource changes. If protection is in place, there is less urgency to converging quickly. If multiple faults occur that are not covered by shared risk groups (SRG), then some protection may fail, adding urgency to converging quickly even where protection is deployed.

   Reducing the amount of information allows the exchange of information during a large routing change to be accomplished more quickly and simplifies route computation. Simplifying route computation improves convergence time after very significant network faults which cannot be handled by preprovisioned or precomputed protection mechanisms. Aggregating smaller LSP into larger LSP is a means to reduce path computation load and reduce RSVP-TE signaling (see Section 3.3).

   Neglecting scaling issues can result in performance issues, such as slow convergence. Neglecting scaling in some cases can result in networks which perform so poorly as to become unstable.

3.2.  Reducing Routing Information and Exchange

   Link bundling provides a means of aggregating control plane information.
   Even where the all-ones component link supported by link bundling is not used, the amount of control information is reduced by a factor equal to the number of component links in a bundle.

   Fully deaggregating link bundle information would negate this benefit. If there is a need to deaggregate, such as to distinguish between groups of links within specified ranges of delay, then no more deaggregation than is necessary should be done.

   For example, in supporting the requirement for heterogeneous component links, it makes little sense to fully deaggregate link bundles when adding support for groups of component links with common attributes within a link bundle can maintain most of the benefit of aggregation while adequately supporting the requirement to support heterogeneous component links.

   Routing information exchange is also reduced by making sensible choices regarding the amount of change to link parameters that requires link readvertisement. For example, if delay measurements include queuing delay, then a much more coarse granularity of delay measurement would be called for than if the delay does not include queuing and is dominated by geographic delay (speed of light delay).

3.3.  Reducing Signaling Load

   Aggregating traffic into very large hierarchical LSP in the core very substantially reduces the number of LSP that need to be signaled and the number of path computations any given LSR will be required to perform when a network fault occurs.

   In the extreme, applying MPLS to a very large network without hierarchy could exceed the 20 bit label space (1,048,576 label values). For example, a network with 4,000 nodes, with 2,000 on either side of a cutset, would have 4,000,000 LSP (2,000 x 2,000) crossing the cutset. Even in a degree four cutset, an uneven distribution of LSP across the cutset, or the loss of one link would result in a need to exceed the size of the label space.
   Among provider networks, 4,000 access nodes is not at all large. Hierarchy is an absolute requirement if all access nodes are to be interconnected in such a network.

   In less extreme cases, having each node terminate hundreds of LSP to achieve a full mesh creates a very large computational load. Computational complexity is a function of the number of nodes (N) and links (L) in a topology, and the number of LSP that need to be set up. In the common case where L is proportional to N (relatively constant node degree with growth), the time complexity of one CSPF computation is order(N log N). If each node must perform order(N) computations when a fault occurs, then the computational load increases as order(N^2 log N) as the number of nodes increases (where "^" is the power operator and "N^2" is read "N-squared"). In practice at the time of writing, this imposes a limit of a few hundred nodes in a full mesh of MPLS LSP before the computational load is sufficient to result in unacceptable convergence times.

   Two solutions are applied to reduce the amount of RSVP-TE signaling. Both involve subdividing the MPLS domain into a core and a set of regions.

3.3.1.  Reducing Signaling Load using LDP MPTP

   LDP can be used for edge-to-edge LSP, using RSVP-TE to carry the LDP intra-core traffic and optionally also using RSVP-TE to carry the LDP intra-region traffic within each region. LDP does not support traffic engineering, but does support multipoint-to-point (MPTP) LSP, which require less signaling than edge-to-edge RSVP-TE point-to-point (PTP) LSP. A drawback of this approach is the inability to use RSVP-TE protection (FRR or GMPLS protection) against failure of the border LSR sitting at a core/region boundary.

3.3.2.  Reducing Signaling Load using Hierarchy

   When the number of nodes grows too large, the amount of RSVP-TE signaling can be reduced using the MPLS PSC hierarchy [RFC4206]. A core within the hierarchy can divide the topology into M regions of on average N/M nodes. Within a region the computational load is reduced by a factor of more than M^2, since order((N/M)^2 log(N/M)) is smaller than order(N^2 log N) by more than M^2. Within the core, the computational load generally becomes quite small since M is usually a fairly small number (a few tens of regions) and each region is typically attached to the core in only two or three places.

   Using hierarchy improves scaling but has two consequences. First, hierarchy effectively forces the use of platform label space. When a containing LSP is rerouted, the labels assigned to the contained LSP cannot be changed but may arrive on a different interface. Second, hierarchy results in much larger LSP. These LSP today are larger than any single component link and therefore force the use of the all-ones component in link bundles.

3.3.3.  Using Both LDP MPTP and RSVP-TE Hierarchy

   It is also possible to use both LDP and RSVP-TE hierarchy. MPLS networks with a very large number of nodes may benefit from the use of both LDP and RSVP-TE hierarchy. The two techniques are certainly not mutually exclusive.

3.4.  Reducing Forwarding State

   Both LDP and MPLS hierarchy have the benefit of reducing the amount of forwarding state. Using the example from Section 3.3, and using MPLS hierarchy, the worst case generally occurs at borders with the core.

   For example, consider a network with approximately 1,000 nodes divided into 10 regions. At the edges, each node requires 1,000 LSP to other edge nodes. The edge nodes also require 100 intra-region LSP. Within the core, if the core has only 3 attachments to each region the core LSR have fewer than 100 intra-core LSP.
   At the border cutset between the core and a given region, in this example there are 100 edge nodes with inter-region LSP crossing that cutset, destined to 900 other edge nodes. That yields forwarding state on the order of 90,000 LSP (100 x 900) at the border cutset. These same routers need only reroute well under 200 LSP when multiple faults occur, as long as only links are affected and a border LSR does not go down.

   Interior to the core, the forwarding state is greatly reduced. If inter-region LSP have different characteristics, it makes sense to make use of aggregates with different characteristics. Rather than exchange information about every inter-region LSP within the intra-core LSP, it makes more sense to use multiple intra-core LSP between pairs of core nodes, each aggregating sets of inter-region LSP with common characteristics or common requirements.

3.5.  Avoiding Route Oscillation

   Networks can become unstable when a feedback loop exists such that moving traffic to a link causes a metric such as delay to increase, which then causes traffic to move elsewhere. For example, the original ARPANET routing used a delay based cost metric and proved prone to route oscillations [DBP].

   Delay may be used as a constraint in routing for high priority traffic, when this high priority traffic makes a minor contribution to total load, such that the movement of the high priority traffic has a small impact on the delay experienced by other high priority traffic. The safest way to measure delay is to make measurements based on traffic which is prioritized such that it is queued ahead of the lower priority traffic which will be affected if high priority traffic is moved. The amount of high priority traffic must be constrained to consume a fraction of link capacities with the remaining capacity available to lower priority traffic.
   Any measurement of jitter (delay variation) that is used in route decisions is likely to cause oscillation. Jitter is caused by queuing effects and cannot be measured using a very high priority measurement traffic flow.

   It may be possible to find links with constrained queuing delay or jitter using a theoretical maximum or a probability based bound on queuing delay or jitter at a given priority based on the types and amounts of traffic accepted and combining that theoretical limit with a measured delay at very high priority. Using delay or jitter as path metrics without creating oscillations is challenging.

   Instability can occur due to poor performance and interaction with protocol timers. In this way a computational scaling problem can become a stability problem when a network becomes sufficiently large.

4.  New Challenges

   New technical challenges are posed by [I-D.ietf-rtgwg-cl-requirement] in both the control plane and data plane.

   Among the more difficult challenges are the following.

   1.  The requirements related to delay or jitter conflict with requirements for scalability and stability (see Section 4.1.1),

   2.  The combination of ingress control over LSP placement and retaining an ability to move traffic as demands dictate can pose challenges and such requirements can even be conflicting (see Section 4.1.2),

   3.  Path symmetry requires extensions and is particularly challenging for very large LSP (see Section 4.1.3),

   4.  Accommodating a very wide range of requirements among contained LSP can lead to inefficiency if the most stringent requirements are reflected in aggregates, or reduce scalability if a large number of aggregates are used to provide too fine a reflection of the requirements in the contained LSP (see Section 4.1.4),

   5.  Backwards compatibility is somewhat limited due to the need to accommodate legacy multipath interfaces which provide too little information regarding their configured default behavior, and legacy LSP which provide too little information regarding their LSP requirements (see Section 4.1.5),

   6.  Data plane challenges include those of accommodating very large LSP, large microflows, traffic ordering constraints imposed by a subset of LSP, and accounting for IP and LDP traffic (see Section 4.2).

4.1.  Control Plane Challenges

   Some of the control plane requirements are particularly challenging. Handling large flows which aggregate smaller flows must be accomplished with minimal impact on scalability. Potentially conflicting are requirements for jitter and requirements for stability. Potentially conflicting are the requirements for ingress control of a large number of parameters, and the requirements for local control needed to achieve traffic balance across an Advanced Multipath. These challenges and potential solutions are discussed in the following sections.

4.1.1.  Delay and Jitter Sensitive Routing

   Delay and jitter sensitive routing are called for in [I-D.ietf-rtgwg-cl-requirement] in requirements FR#2, FR#7, FR#8, FR#9, FR#15, FR#16, FR#17, FR#18. Requirement FR#17 is particularly problematic, calling for constraints on jitter.

   A tradeoff exists between scaling benefits of aggregating information, and potential benefits of using a finer granularity in delay reporting. To maintain the scaling benefit, measured link delay for any given Advanced Multipath SHOULD be aggregated into a small number of delay ranges. IGP-TE extensions MUST be provided which advertise the available capacities for each of the selected ranges.
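
   As an illustration only, the aggregation of measured delay into a small number of ranges might be sketched as follows. The ComponentLink record, the aggregate_by_delay function, the range boundaries, and the units are all hypothetical and chosen for this example; this document does not define them, nor the encoding of the resulting per-range capacities in any IGP-TE extension.

```python
import bisect
from dataclasses import dataclass


@dataclass
class ComponentLink:
    """Hypothetical record for one component link of an Advanced Multipath."""
    name: str
    delay_ms: float        # measured link delay (assumed unit: milliseconds)
    available_gbps: float  # unreserved capacity (assumed unit: Gb/s)


def aggregate_by_delay(links, boundaries_ms):
    """Sum available capacity per delay range.

    boundaries_ms is an ascending list of inclusive upper bounds.  Range i
    covers delays greater than boundaries_ms[i-1] and up to boundaries_ms[i];
    one extra open-ended range collects everything above the last bound.
    """
    totals = [0.0] * (len(boundaries_ms) + 1)
    for link in links:
        # bisect_left returns the index of the first bound >= the delay,
        # which is exactly the index of the range the link falls into.
        index = bisect.bisect_left(boundaries_ms, link.delay_ms)
        totals[index] += link.available_gbps
    return totals
```

   With two boundaries (for example 10 ms and 30 ms), any number of component links collapses into three advertised capacity figures, which is the scaling benefit described above. A finer set of boundaries gives more precise delay reporting at the cost of more advertised information.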
   For path selection of delay sensitive LSP, the ingress SHOULD bias link metrics based on available capacity and select a low cost path which meets LSP total path delay criteria. To communicate the requirements of an LSP, the ERO MUST be extended to indicate the per link constraints. To communicate the type of resource used, the RRO SHOULD be extended to carry an identification of the group that is used to carry the LSP at each link bundle hop.

4.1.2.  Local Control of Traffic Distribution

   Many requirements in [I-D.ietf-rtgwg-cl-requirement] suggest that a node immediately adjacent to a component link should have a high degree of control over how traffic is distributed, as long as network performance objectives are met. Particularly relevant are FR#18 and FR#19.

   The requirements to allow local control are potentially in conflict with requirement FR#21 which gives full control of component link selection to the LSP ingress. While supporting this capability is mandatory, use of this feature is optional per LSP.

   A given network deployment will have to consider this set of conflicting requirements and make appropriate use of local control of traffic placement and ingress control of traffic placement to best meet network requirements.

4.1.3.  Path Symmetry Requirements

   Requirement FR#21 in [I-D.ietf-rtgwg-cl-requirement] includes a provision to bind both directions of a bidirectional LSP to the same component. This is easily achieved if the LSP is directly signaled across an Advanced Multipath. This is not as easily achieved if a set of LSP with this requirement are signaled over a large hierarchical LSP which is in turn carried over an Advanced Multipath. The basis for load distribution in such a case is the label stack. The labels in either direction are completely independent.
   This could be accommodated if the ingress, egress, and all midpoints of the hierarchical LSP make use of an entropy label in the distribution, and the ingress uses a fixed value per contained LSP in the entropy label. A solution for this problem may add complexity with very little benefit. There is little or no true benefit in using symmetrical paths rather than component links of identical characteristics.

   Traffic symmetry and large LSP capacity are a second pair of conflicting requirements. Any given LSP can meet one of these two requirements but not both. A given network deployment will have to make appropriate use of each of these features to best meet network requirements.

4.1.4.  Requirements for Contained LSP

   [I-D.ietf-rtgwg-cl-requirement] calls for new LSP constraints. These constraints include frequency of load balancing rearrangement, delay and jitter, packet ordering constraints, and path symmetry.

   When LSP are contained within hierarchical LSP, there is no signaling available at midpoint LSR which identifies the contained LSP, let alone provides the set of requirements unique to each contained LSP. Defining extensions to provide this information would severely impact scalability and defeat the purpose of aggregating control information and forwarding information into hierarchical LSP. For the same scalability reasons, not aggregating at all is not a viable option for large networks where scalability and stability problems may occur as a result.

   As pointed out in Section 4.1.3, the benefits of supporting symmetric paths among LSP contained within hierarchical LSP may not be sufficient to justify the complexity of supporting this capability.
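
   To make the coordination involved more concrete, the entropy label approach mentioned in Section 4.1.3 might be sketched as follows. This is a minimal illustration under strong assumptions that this document does not specify: both ends derive the same entropy value from a shared LSP identifier, every LSR distributes traffic on the entropy value alone, every LSR applies the same hash, and both ends number the component links identically. The function names are hypothetical, and the modulo placement shown is the simplest technique rather than a recommended one.

```python
import zlib


def entropy_value(lsp_id: str) -> int:
    """Hypothetical derivation of one fixed 20-bit value per contained LSP."""
    value = zlib.crc32(lsp_id.encode("utf-8")) & 0xFFFFF
    # Label values 0-15 are reserved and cannot be used as an entropy label,
    # so shift any such result out of the reserved range.
    return value if value > 15 else value + 16


def select_component(entropy: int, n_components: int) -> int:
    """Pick a component index from the entropy value alone.

    Because the rest of the label stack is ignored, two directions that
    carry the same entropy value select the same index, provided that the
    assumptions stated in the lead-in hold.
    """
    return zlib.crc32(entropy.to_bytes(3, "big")) % n_components
```

   Each of the stated assumptions requires agreement between LSR that current signaling does not provide, which is the added complexity referred to above.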
   A scalable solution which accommodates multiple sets of LSP between given pairs of LSR is to provide multiple hierarchical LSP for each given pair of LSR, each hierarchical LSP aggregating LSP with common requirements and a common pair of endpoints. This is a network design technique available to the network operator rather than a protocol extension. This technique can accommodate multiple sets of delay and jitter parameters, multiple sets of frequency of load balancing parameters, multiple sets of packet ordering constraints, etc.

4.1.5.  Retaining Backwards Compatibility

   Backwards compatibility and support for incremental deployment require considering the impact of legacy LSR in the role of LSP ingress, and considering the impact of legacy LSR advertising ordinary links, advertising Ethernet LAG as ordinary links, and advertising link bundles.

   Legacy LSR in the role of LSP ingress cannot signal requirements which are not supported by their control plane software. The additional capabilities supported by other LSR have no impact on these LSR. These LSR, however, being unaware of extensions, may try to make use of scarce resources which support specific requirements such as low delay. To a limited extent it may be possible for a network operator to avoid this issue using existing mechanisms such as link administrative attributes and attribute affinities [RFC3209].

   Legacy LSR advertising ordinary links will not advertise attributes needed by some LSP. For example, there is no way to determine the delay or jitter characteristics of such a link. Legacy LSR advertising Ethernet LAG pose additional problems. There is no way to determine that packet ordering constraints would be violated for LSP with strict packet ordering constraints, or that frequency of load balancing rearrangement constraints might be violated.
   Legacy LSR advertising link bundles have no way to advertise the configured default behavior of the link bundle. Some link bundles may be configured to place each LSP on a single component link and therefore may not be able to accommodate an LSP which requires bandwidth in excess of the size of a component link. Some link bundles may be configured to spread all LSP over the all-ones component. For LSR using the all-ones component link, there is no documented procedure for correctly setting the "Maximum LSP Bandwidth". There is currently no way to indicate the largest microflow that could be supported by a link bundle using the all-ones component link.

   Having received the RRO, it is possible for an ingress to look for the all-ones component to identify such link bundles after having signaled at least one LSP. Whether any LSR collects this information on legacy LSR and makes use of it to set defaults is an implementation choice.

4.2.  Data Plane Challenges

   Flow identification is briefly discussed in Section 2.1. Traffic distribution is briefly discussed in Section 2.3. This section discusses issues specific to particular requirements specified in [I-D.ietf-rtgwg-cl-requirement].

4.2.1.  Very Large LSP

   Very large LSP may exceed the capacity of any single component of an Advanced Multipath. In some cases contained LSP may exceed the capacity of any single component. These LSP may make use of the equivalent of the all-ones component of a link bundle, or may use a subset of components which meet the LSP requirements.

   Very large LSP can be accommodated as long as they can be subdivided (see Section 4.2.2). A very large LSP cannot have a requirement for symmetric paths unless complex protocol extensions are proposed (see Section 2.2 and Section 4.1.3).

4.2.2.  Very Large Microflows

   Within a very large LSP there may be very large microflows. A very large microflow is one which cannot be further subdivided and consumes a very large amount of capacity. Flows which cannot be subdivided must be no larger than the capacity of any single component link.

   Current signaling provides no way to specify the largest microflow that can be supported on a given link bundle in routing advertisements. Extensions which address this are discussed in Section 6.4. Absent extensions of this type, traffic containing microflows that are too large for a given Advanced Multipath may be present. There is no data plane solution for this problem that would not require reordering traffic at the Advanced Multipath egress.

   Some techniques are susceptible to statistical collisions where an algorithm to distribute traffic is unable to disambiguate traffic among two or more very large microflows whose sum is in excess of the capacity of any single component. Hash based algorithms which use too small a hash space are particularly susceptible and require a change in hash seed if this occurs. A change in hash seed is highly disruptive, causing traffic reordering among all traffic flows over which the hash function is applied.

4.2.3.  Traffic Ordering Constraints

   Some LSP have strict traffic ordering constraints. Most notable among these are MPLS-TP LSP. In the absence of aggregation into hierarchical LSP, those LSP with strict traffic ordering constraints can be placed on individual component links if there is a means of identifying which LSP have such a constraint. If LSP with strict traffic ordering constraints are aggregated in hierarchical LSP, the hierarchical LSP capacity may exceed the capacity of any single component link.
   In such a case the load balancing may be constrained through the use of an entropy label [RFC6790]. This and related issues are discussed further in Section 6.4.

4.2.4.  Accounting for IP and LDP Traffic

   Networks which carry RSVP-TE signaled MPLS traffic generally carry low volumes of native IP traffic, often only carrying control traffic as native IP. There is no architectural guarantee of this; it is just how network operators have made use of the protocols.

   [I-D.ietf-rtgwg-cl-requirement] requires that native IP and native LDP be accommodated (DR#2 and DR#3). In some networks, a subset of services may be carried as native IP or carried as native LDP. Today this may be accommodated by the network operator estimating the contribution of IP and LDP and configuring a lower set of available bandwidth figures on the RSVP-TE advertisements.

   The only improvement that Advanced Multipath can offer is that of measuring the IP and LDP traffic levels and automatically reducing the available bandwidth figures on the RSVP-TE advertisements. The measurements would have to be filtered. This is similar to a feature in existing LSR, commonly known as "autobandwidth", with a key difference. In the "autobandwidth" feature, the bandwidth request of an RSVP-TE signaled LSP is adjusted in response to traffic measurements. In this case the IP or LDP traffic measurements are used to reduce the link bandwidth directly, without first encapsulating in an RSVP-TE LSP.

   This may be a subtle and perhaps even a meaningless distinction if Advanced Multipath is used to form a Sub-Path Maintenance Element (SPME). An SPME is in practice essentially an unsignaled single hop LSP with PHP enabled [RFC5921].
   An Advanced Multipath SPME looks very much like classic multipath, where there is no signaling, only management plane configuration creating the multipath entity (of which Ethernet Link Aggregation is a subset).

4.2.5.  IP and LDP Limitations

   IP does not offer traffic engineering. LDP cannot be extended to offer traffic engineering [RFC3468]. Therefore there is no traffic engineered fallback to an alternate path for IP and LDP traffic if resources are not adequate for the IP and/or LDP traffic alone on a given link in the primary path. The only option for IP and LDP would be to declare the link down. Declaring a link down due to resource exhaustion would reduce traffic to zero and eliminate the resource exhaustion, at which point the link would again appear usable. This would cause oscillations and is therefore not a viable solution.

   Congestion caused by IP or LDP traffic loads is a pathological case that can occur if IP and/or LDP are carried natively and there is a high volume of IP or LDP traffic. This situation can be avoided by carrying IP and LDP within RSVP-TE LSP.

   It is also not possible to route LDP traffic differently for different FEC. LDP traffic engineering is specifically disallowed by [RFC3468]. It may be possible to support multi-topology IGP extensions to accommodate more than one set of criteria. If so, the additional IGP could be bound to the forwarding criteria, and the LDP FEC bound to a specific IGP instance, inheriting the forwarding criteria. Alternatively, one IGP instance can be used and the LDP SPF can make use of the constraints, such as delay and jitter, for a given LDP FEC.

5.  Existing Mechanisms

   In MPLS the one mechanism which supports explicit signaling of multiple parallel links is Link Bundling [RFC4201]. The set of techniques known as "classic multipath" support no explicit signaling, except in two cases.
   In Ethernet Link Aggregation the Link Aggregation Control Protocol (LACP) coordinates the addition or removal of members from an Ethernet Link Aggregation Group (LAG). The use of the "all-ones" component of a link bundle indicates use of classic multipath; however, the ability to determine if a link bundle makes use of classic multipath is not yet supported.

5.1.  Link Bundling

   Link bundling supports advertisement of a set of homogeneous links as a single route advertisement. Link bundling supports placement of an LSP on any single component link, or supports placement of an LSP on the all-ones component link. Not all link bundling implementations support the all-ones component link. There is no way for an ingress LSR to tell which potential midpoint LSR support this feature and use it by default and which do not. Based on [RFC4201] it is unclear how to advertise a link bundle for which the all-ones component link is available and used by default. Common practice is to violate the specification and set the Maximum LSP Bandwidth to the Available Bandwidth. There is no means to determine the largest microflow that could be supported by a link bundle that is using the all-ones component link.

   [RFC6107] extends the procedures for hierarchical LSP but also extends link bundles. An LSP can be explicitly signaled to indicate that it is an LSP to be used as a component of a link bundle. Prior to that the common practice was to simply not advertise the component link LSP into the IGP, since only the ingress and egress of the link bundle needed to be aware of their existence, which they would be aware of due to the RSVP-TE signaling used in setting up the component LSP.

   While link bundling can be the basis for Advanced Multipath, a significant number of small extensions need to be added.

   1.  To support link bundles of heterogeneous links, a means of advertising the capacity available within a group of homogeneous links needs to be provided.

   2.  Attributes need to be defined to support the following parameters for the link bundle or for a group of homogeneous links.

       A.  delay range

       B.  jitter (delay variation) range

       C.  group metric

       D.  all-ones component capable

       E.  capable of dynamically balancing load

       F.  largest supportable microflow

       G.  support for entropy label

   3.  For each of the prior extended attributes, the constraint based routing path selection needs to be extended to reflect new constraints based on the extended attributes.

   4.  For each of the prior extended attributes, LSP admission control needs to be extended to reflect new constraints based on the extended attributes.

   5.  Dynamic load balance must be provided for flows within a given set of links with common attributes such that Performance Objectives are not violated, including frequency of load balance adjustment for any given flow.

5.2.  Classic Multipath

   Classic multipath is described in [I-D.ietf-rtgwg-cl-use-cases].

   Classic multipath refers to the most common current practice in implementation and deployment of multipath. The most common current practice makes use of a hash on the MPLS label stack and, if IPv4 or IPv6 are indicated under the label stack, makes use of the IP source and destination addresses [RFC4385] [RFC4928].

   Classic multipath provides a highly scalable means of load balancing. Dynamic multipath has proven value in assuring an even loading on component links and an ability to adapt to changes in offered load that occur over periods of hundreds of milliseconds or more.
Classic 1229 multipath scalability is due to the ability to effectively work with 1230 an extremely large number of flows (IP host pairs) using relatively 1231 few resources (a data structure accessed using a hash result as a 1232 key or using ranges of hash results). 1234 Classic multipath meets a small subset of Advanced Multipath 1235 requirements. Due to scalability of the approach, classic multipath 1236 seems to be an excellent candidate for extension to meet the full set 1237 of Advanced Multipath forwarding requirements. 1239 Additional detail can be found in [I-D.ietf-rtgwg-cl-use-cases]. 1241 6. Mechanisms Proposed in Other Documents 1243 A number of documents, which at the time of writing are works in 1244 progress, address parts of the requirements of Advanced Multipath, or 1245 assist in making some of the goals achievable. 1247 6.1. Loss and Delay Measurement 1249 Procedures for measuring loss and delay are provided in [RFC6374]. 1250 These are OAM based measurements. This work could be the basis of 1251 delay measurements and delay variation measurements used for metrics 1252 called for in [I-D.ietf-rtgwg-cl-requirement]. 1254 Currently there are three documents that address delay and delay 1255 variation metrics. 1257 draft-ietf-ospf-te-metric-extensions 1258 [I-D.ietf-ospf-te-metric-extensions] provides a set of OSPF-TE 1259 extensions to support delay, jitter, and loss. Stability is not 1260 adequately addressed and some minor issues remain. 1262 I-D.previdi-isis-te-metric-extensions 1263 [I-D.previdi-isis-te-metric-extensions] provides the set of 1264 extensions for ISIS that [I-D.ietf-ospf-te-metric-extensions] 1265 provides for OSPF. This draft mirrors 1266 [I-D.ietf-ospf-te-metric-extensions], sometimes lagging for a 1267 brief period when the OSPF version is updated.
1269 I-D.atlas-mpls-te-express-path 1270 [I-D.atlas-mpls-te-express-path] provides information on the use 1271 of OSPF and ISIS extensions defined in 1272 [I-D.ietf-ospf-te-metric-extensions] and 1273 [I-D.previdi-isis-te-metric-extensions] and a modified CSPF path 1274 selection to meet LSP performance criteria such as minimal delay 1275 paths or bounded delay paths. 1277 Delay variance, loss, residual bandwidth, and available bandwidth 1278 extensions are particularly prone to network instability. The question 1279 as to whether queuing delay and delay variation should be considered, 1280 and if so for which Diffserv PHB Scheduling Class (PSC), is not 1281 adequately addressed in the current versions of these drafts. These 1282 drafts are actively being discussed and updated, and remaining issues 1283 are expected to be resolved. 1285 6.2. Link Bundle Extensions 1287 A set of extensions is needed to indicate a group of component links 1288 in the ERO or RRO, where the group is given an interface 1289 identification like the bundle itself. The extensions could also be 1290 further extended to support specification of the all-ones component 1291 link in the ERO or RRO. 1293 [I-D.ospf-cc-stlv] provides a baseline draft for extending link 1294 bundling to advertise components. A new component TLV (C-TLV) is 1295 proposed, which must reference an Advanced Multipath Link TLV. 1296 [I-D.ospf-cc-stlv] is intended for the OSPF WG and submitted for the 1297 "Experimental" track. The 00 version expired in February 2012. A 1298 replacement is expected that will be submitted for consideration on 1299 the standards track. 1301 6.3. Pseudowire Flow and MPLS Entropy Labels 1303 Two documents provide a means to add entropy for the purpose of 1304 improving load balance. MPLS encapsulation can bury information that 1305 is needed to identify microflows.
These two documents allow a 1306 pseudowire ingress and LSP ingress respectively to add a label solely 1307 for the purpose of providing a finer granularity of microflow groups. 1309 [RFC6391] allows pseudowires which carry a large volume of traffic, 1310 and in which microflows can be identified, to be load balanced across 1311 multiple members of an Ethernet LAG or an MPLS link bundle. This is 1312 accomplished by adding a flow label below the pseudowire label in the 1313 MPLS label stack. For this to be effective the link bundle load 1314 balance must make use of the label stack up to and including this 1315 flow label. 1317 [RFC6790] provides a means for an LER to put an additional label known 1318 as an entropy label on the MPLS label stack. Only the LER can add 1319 the entropy label. The LER of a PSC LSP would have to add an entropy 1320 label for contained LSPs for which it is a midpoint LSR. 1322 A core LSR acting as LER for an aggregated LSP can add entropy labels 1323 based on deep packet inspection and place an entropy label indicator 1324 (ELI) and entropy label (EL) just below the label being acted on. 1325 This would be helpful in situations where the label stack depth to 1326 which load distribution can operate is limited by implementation or 1327 is limited for other reasons such as carrying both MPLS-TP and MPLS 1328 with entropy labels within the same hierarchical LSP. 1330 6.4. Multipath Extensions 1332 The multipath extension drafts address the issue of accommodating 1333 LSP which have strict packet ordering constraints in a network 1334 containing multipath. MPLS-TP has become one important instance 1335 of LSP with strict packet ordering constraints and has driven this 1336 work. 1338 [I-D.ietf-mpls-multipath-use] proposes the use of the MPLS Entropy Label 1339 [RFC6790] to allow MPLS-TP to be carried within MPLS LSP that make 1340 use of multipath. Limitations of this approach in the absence of 1341 protocol extensions are discussed.
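As an illustration only, the following Python sketch shows how an entropy label in the sense of [RFC6790] could keep all packets of an MPLS-TP LSP on one component link while other traffic is still spread across components. The function names, the use of CRC-32 as the hash, the modulo mapping (chosen for brevity), and the policy of hashing only the label that follows the ELI are assumptions made for this example and are not taken from the cited documents. The only protocol constant used is the reserved ELI label value of 7 from [RFC6790].

```python
import zlib
from typing import List, Tuple

ELI = 7  # Entropy Label Indicator, reserved label value in RFC 6790


def push_entropy_label(stack: List[int], entropy: int) -> List[int]:
    """Place ELI + EL just below the top (containing LSP) label.

    Hypothetical helper: the stack is a list of 20-bit label values,
    top of stack first.
    """
    if not stack:
        raise ValueError("label stack must not be empty")
    if not 16 <= entropy < 2 ** 20:
        raise ValueError("entropy label must be a non-reserved 20-bit value")
    return [stack[0], ELI, entropy] + stack[1:]


def flow_key(stack: List[int]) -> Tuple[int, ...]:
    """Assumed policy: if an ELI is present, use only the label after it."""
    if ELI in stack:
        pos = stack.index(ELI)
        if pos + 1 < len(stack):
            return (stack[pos + 1],)
    return tuple(stack)  # no entropy label: fall back to the whole stack


def select_component(stack: List[int], n_components: int) -> int:
    """Map a label stack to a component link index (illustrative hash)."""
    key = b"".join(label.to_bytes(3, "big") for label in flow_key(stack))
    return zlib.crc32(key) % n_components


# Two packets of the same MPLS-TP LSP carrying different inner labels.
pkt_a = push_entropy_label([100, 2001], entropy=4242)
pkt_b = push_entropy_label([100, 2002], entropy=4242)
same_link = select_component(pkt_a, 4) == select_component(pkt_b, 4)
```

Because both packets carry the same entropy label, the sketch places them on the same component regardless of the labels beneath it, which is the ordering property the MPLS-TP case needs.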
1343 [I-D.villamizar-mpls-multipath-extn] provides protocol extensions 1344 needed to overcome the limitations 1345 discussed in [I-D.ietf-mpls-multipath-use]. 1347 7. Required Protocol Extensions and Mechanisms 1349 Prior sections have reviewed key characteristics, architecture 1350 tradeoffs, new challenges, existing mechanisms, and relevant 1351 mechanisms proposed in other documents. 1353 This section first summarizes and groups requirements specified in 1354 [I-D.ietf-rtgwg-cl-requirement] (see Section 7.1). A set of 1355 document coverage groupings is proposed with existing works-in- 1356 progress noted where applicable (see Section 7.2). The set of 1357 extensions is then grouped by protocol affected as a convenience to 1358 implementors (see Section 7.3). 1360 7.1. Brief Review of Requirements 1362 The following list provides a categorization of requirements 1363 specified in [I-D.ietf-rtgwg-cl-requirement] along with a short 1364 phrase indicating what topic the requirement covers.
1366 routing information aggregation 1367 FR#1 (routing summarization), FR#20 (Advanced Multipath may be a 1368 component of another Advanced Multipath) 1370 restoration speed 1371 FR#2 (restoration speed meeting performance objectives), FR#12 1372 (minimally disruptive load rebalance), DR#6 (fast convergence), 1373 DR#7 (fast worst case failure convergence) 1375 load distribution, stability, minimal disruption 1376 FR#3 (automatic load distribution), FR#5 (must not oscillate), 1377 FR#11 (dynamic placement of flows), FR#12 (minimally disruptive 1378 load rebalance), FR#13 (bounded rearrangement frequency), FR#18 1379 (flow placement must satisfy performance objectives), FR#19 (flow 1380 identification finer than per top level LSP), MR#6 (operator 1381 initiated flow rebalance) 1383 backward compatibility and migration 1384 FR#4 (smooth incremental deployment), FR#6 (management and 1385 diagnostics must continue to function), DR#1 (extend existing 1386 protocols), DR#2 (extend LDP, no LDP TE) 1388 delay and delay variation 1389 FR#7 (expose lower layer measured delay), FR#8 (precision of 1390 latency reporting), FR#9 (limit latency on per LSP basis), FR#15 1391 (minimum delay path), FR#16 (bounded delay path), FR#17 (bounded 1392 jitter path) 1394 admission control, preemption, traffic engineering 1395 FR#10 (admission control, preemption), FR#14 (packet ordering), 1396 FR#21 (ingress specification of path), FR#22 (path symmetry), 1397 DR#3 (IP and LDP traffic), MR#3 (management specification of 1398 path) 1400 single vs multiple domain 1401 DR#4 (IGP extensions allowed within single domain), DR#5 (IGP 1402 extensions disallowed in multiple domain case) 1404 general network management 1405 MR#1 (polling, configuration, and notification), MR#2 (activation 1406 and de-activation) 1408 path determination, connectivity verification 1409 MR#4 (path trace), MR#5 (connectivity verification) 1411 The above list is not intended as a substitute for 1412 
[I-D.ietf-rtgwg-cl-requirement], but rather as a concise grouping and 1413 reminder of requirements to serve as a means of more easily 1414 determining requirements coverage of a set of protocol documents. 1416 7.2. Proposed Document Coverage 1418 The primary areas where additional protocol extensions and mechanisms 1419 are required include the topics described in the following 1420 subsections. 1422 There are candidate documents for a subset of the topics below. This 1423 grouping of topics does not require that each topic be addressed by a 1424 separate document. In some cases, a document may cover multiple 1425 topics, or a specific topic may be addressed as applicable in 1426 multiple documents. 1428 7.2.1. Component Link Grouping 1430 An extension to link bundling is needed to specify a group of 1431 components with common attributes. This can be a TLV defined within 1432 the link bundle that carries the same encapsulations as the link 1433 bundle. Two interface indices would be needed for each group. 1435 a. An index is needed that, if included in an ERO, would indicate the 1436 need to place the LSP on any one component within the group. 1438 b. A second index is needed that, if included in an ERO, would 1439 indicate the need to balance flows within the LSP across all 1440 components of the group. This is equivalent to the "all-ones" 1441 component for the entire bundle. 1443 [I-D.ospf-cc-stlv] can be extended to include multipath treatment 1444 capabilities. An ISIS solution is also needed. An extension of 1445 RSVP-TE signaling is needed to indicate multipath treatment 1446 preferences. 1448 If a component group is allowed to support all of the parameters of a 1449 link bundle, then a group TE metric would be accommodated. This can 1450 be supported with the component TLV (C-TLV) defined in 1451 [I-D.ospf-cc-stlv].
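The two interface indices per group described in items a and b of this section can be pictured as a lookup from an ERO interface identifier to a group and a placement mode. The Python sketch below is a minimal illustration under stated assumptions: the class names, field names, and index values are hypothetical, and no encoding for the group TLV or the ERO subobject is defined here or in [I-D.ospf-cc-stlv].

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Iterable, Tuple


class Placement(Enum):
    ANY_ONE = "place the LSP on any one component of the group"
    ALL = "balance flows within the LSP across all components"


@dataclass(frozen=True)
class ComponentGroup:
    """Hypothetical model of a group of components with common attributes."""
    any_one_index: int            # index of item a
    all_index: int                # index of item b ("all-ones" for the group)
    components: Tuple[int, ...]   # component link identifiers
    te_metric: int                # group TE metric


IndexTable = Dict[int, Tuple[ComponentGroup, Placement]]


def build_index_table(groups: Iterable[ComponentGroup]) -> IndexTable:
    """Map every advertised index to its group and placement mode."""
    table: IndexTable = {}
    for group in groups:
        pairs = ((group.any_one_index, Placement.ANY_ONE),
                 (group.all_index, Placement.ALL))
        for index, mode in pairs:
            if index in table:
                raise ValueError(f"interface index {index} is not unique")
            table[index] = (group, mode)
    return table


def resolve_ero_hop(table: IndexTable, interface_id: int):
    """Return (group, placement) for an ERO hop; KeyError if unknown."""
    return table[interface_id]


low_delay = ComponentGroup(any_one_index=11, all_index=12,
                           components=(1, 2), te_metric=10)
high_delay = ComponentGroup(any_one_index=21, all_index=22,
                            components=(3, 4, 5), te_metric=30)
table = build_index_table([low_delay, high_delay])
group, mode = resolve_ero_hop(table, 22)
```

In this sketch an ERO hop naming index 22 asks for load balancing across components 3, 4, and 5, while index 21 would ask for placement on exactly one of them.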
1453 The primary focus of this document, among the sets of requirements 1454 listed in Section 7.1, is the "routing information aggregation" set of 1455 requirements. The "restoration speed", "backward compatibility and 1456 migration", and "general network management" requirements must also 1457 be considered. 1459 7.2.2. Delay and Jitter Extensions 1461 An extension is needed in the IGP-TE advertisement to support delay 1462 and delay variation for links, link bundles, and forwarding 1463 adjacencies. Whatever mechanism is described must take precautions 1464 that ensure that route oscillations cannot occur. The following set 1465 of drafts addresses this. 1467 1. [I-D.ietf-ospf-te-metric-extensions] 1469 2. [I-D.previdi-isis-te-metric-extensions] 1470 3. [I-D.atlas-mpls-te-express-path] 1472 The primary focus of this document, among the sets of requirements 1473 listed in Section 7.1, is the "delay and delay variation" set of 1474 requirements. The "restoration speed", "backward compatibility and 1475 migration", and "general network management" requirements must also 1476 be considered. 1478 7.2.3. Path Selection and Admission Control 1480 Path selection and admission control changes must be documented in 1481 each document that proposes a protocol extension that advertises a 1482 new capability or parameter that must be supported by changes in path 1483 selection and admission control. 1485 It would also be helpful to have an informational document which 1486 covers path selection and admission control issues in detail and 1487 briefly summarizes and references the set of documents which propose 1488 extensions. This document could be advanced in parallel with the 1489 protocol extensions. 1491 The primary focus of this document, among the sets of requirements 1492 listed in Section 7.1, is the "load distribution, stability, minimal 1493 disruption" and "admission control, preemption, traffic engineering" 1494 sets of requirements.
The "restoration speed" and "path 1495 determination, connectivity verification" requirements must also be 1496 considered. The "backward compatibility and migration", and "general 1497 network management" requirements must also be considered. 1499 7.2.4. Dynamic Multipath Balance 1501 FR#11 explicitly calls for dynamic placement of flows. Load 1502 balancing similar to existing dynamic multipath would satisfy this 1503 requirement. In implementations where flow identification uses a 1504 coarse granularity, the adjustments would have to be equally coarse, 1505 in the worst case moving entire LSP. The impact of flow 1506 identification granularity and potential dynamic multipath approaches 1507 may need to be documented in greater detail than provided here. 1509 The primary focus of this document, among the sets of requirements 1510 listed in Section 7.1 are the "restoration speed" and the "load 1511 distribution, stability, minimal disruption" sets of requirements. 1512 The "path determination, connectivity verification" requirements must 1513 also be considered. The "backward compatibility and migration", and 1514 "general network management" requirements must also be considered. 1516 7.2.5. Frequency of Load Balance 1518 IGP-TE and RSVP-TE extensions are needed to support frequency of load 1519 balancing rearrangement called for in FR#13, and FR#15-FR#17. 1520 Constraints are not defined in RSVP-TE, but could be modeled after 1521 administrative attribute affinities in RFC3209 and elsewhere. 1523 The primary focus of this document, among the sets of requirements 1524 listed in Section 7.1 is the "load distribution, stability, minimal 1525 disruption" set of requirements. The "path determination, 1526 connectivity verification" must also be considered. The "backward 1527 compatibility and migration" and "general network management" 1528 requirements must also be considered. 1530 7.2.6. 
Inter-Layer Communication 1532 Lower layer to upper layer communication is called for in FR#7 and 1533 FR#20. Specific parameters, in particular delay and delay variation, 1534 need to be addressed. Passing information from a lower non-MPLS 1535 layer to an MPLS layer needs to be addressed, though this may largely 1536 be generic advice encouraging a coupling of MPLS to lower layer 1537 management plane or control plane interfaces. This topic can be 1538 addressed in each document proposing a protocol extension, where 1539 applicable. 1541 The primary focus of this document, among the sets of requirements 1542 listed in Section 7.1, is the "restoration speed" set of requirements. 1543 The "backward compatibility and migration" and "general network 1544 management" requirements must also be considered. 1546 7.2.7. Packet Ordering Requirements 1548 A document is needed to define extensions supporting various packet 1549 ordering requirements, ranging from requirements to preserve 1550 microflow ordering only, to requirements to preserve full LSP 1551 ordering (as in MPLS-TP). This is covered by 1552 [I-D.ietf-mpls-multipath-use] and 1553 [I-D.villamizar-mpls-multipath-extn]. 1555 The primary focus of this document, among the sets of requirements 1556 listed in Section 7.1, is the "admission control, preemption, traffic 1557 engineering" and the "path determination, connectivity verification" 1558 sets of requirements. The "backward compatibility and migration" and 1559 "general network management" requirements must also be considered. 1561 7.2.8. Minimally Disruptive Load Balance 1563 The behavior of hash methods used in classic multipath needs to be 1564 described in terms of FR#12, which calls for minimally disruptive load 1565 adjustments. For example, reseeding the hash violates FR#12. Using 1566 modulo operations is significantly disruptive if a link comes up or goes 1567 down, as pointed out in [RFC2992].
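The disruption caused by modulo operations can be made concrete with a small calculation. The Python sketch below counts how many hash values change component when one of four equal-cost components is lost, once with a modulo mapping and once with a split of the hash space into equal contiguous ranges in the style of the hash-threshold method analyzed in [RFC2992]. The hash space size, the function names, and the assumption that the last component is the one removed are choices made for this example only; the range split is shown as a comparison point, not as a mechanism proposed by this framework.

```python
HASH_SPACE = 12000  # illustrative size of the hash result space


def modulo_select(hash_value: int, n_components: int) -> int:
    """Component chosen by a modulo operation on the hash result."""
    return hash_value % n_components


def range_select(hash_value: int, n_components: int) -> int:
    """Component chosen by splitting the hash space into equal ranges."""
    return hash_value * n_components // HASH_SPACE


def fraction_moved(select, before: int, after: int) -> float:
    """Fraction of hash values whose component changes."""
    moved = sum(1 for h in range(HASH_SPACE)
                if select(h, before) != select(h, after))
    return moved / HASH_SPACE


modulo_moved = fraction_moved(modulo_select, 4, 3)
range_moved = fraction_moved(range_select, 4, 3)
```

Under these assumptions the modulo mapping moves 75% of the hash values and the range split moves 50%. One quarter of the hash values were on the lost component and have to move in either case, so the remainder is disruption to flows that were not on the failed component.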
In addition, backwards 1568 compatibility with older hardware needs to be accommodated. 1570 The primary focus of this document, among the sets of requirements 1571 listed in Section 7.1, is the "load distribution, stability, minimal 1572 disruption" set of requirements. 1574 7.2.9. Path Symmetry 1576 Protocol extensions are needed to support dynamic load balance as 1577 called for to meet FR#22 (path symmetry) and to meet FR#11 (dynamic 1578 placement of flows). 1580 Currently path symmetry can only be supported in link bundling if the 1581 path is pinned. When a flow is moved, both ingress and egress must 1582 make the move as close to simultaneously as possible to satisfy FR#22 1583 and FR#12 (minimally disruptive load rebalance). There is currently 1584 no protocol to coordinate this move. 1586 If a group of flows is identified using a hash, then the hash must 1587 be identical on the pair of LSRs at the endpoints, using the same hash 1588 seed and with one side swapping source and destination. If the label 1589 stack is used, then either the entire label stack must be a special 1590 case flow identification, since the sets of labels in the two directions 1591 are not correlated, or the two LSRs must conspire to use the same flow 1592 identifier. For example, using a common entropy label value, and 1593 using only the entropy label in the flow identification would satisfy 1594 the forwarding requirement. There is no protocol to indicate special 1595 treatment of a label stack within a hierarchical LSP. Adding such an 1596 extension may add significant complexity and ultimately may prove 1597 unscalable. 1599 The primary focus of this document, among the sets of requirements 1600 listed in Section 7.1, is the "load distribution, stability, minimal 1601 disruption" and the "admission control, preemption, traffic 1602 engineering" sets of requirements.
The "backward compatibility and 1603 migration" and "general network management" requirements must also be 1604 considered. Path symmetry simplifies support for the "path 1605 determination, connectivity verification" set of requirements, but 1606 with significant complexity added elsewhere. 1608 7.2.10. Performance, Scalability, and Stability 1610 A separate document providing analysis of performance, scalability, 1611 and stability impacts of changes may be needed. The topic of traffic 1612 adjustment oscillation must also be covered. If sufficient coverage 1613 is provided in each document covering a protocol extension, a 1614 separate document would not be needed. 1616 The primary focus of this document, among the sets of requirements 1617 listed in Section 7.1 is the "restoration speed" set of requirements. 1618 This is not a simple topic and not a topic that is well served by 1619 scattering it over multiple documents, therefore it may be best to 1620 put this in a separate document and put citations in documents called 1621 for in Section 7.2.1, Section 7.2.2, Section 7.2.3, Section 7.2.9, 1622 Section 7.2.11, Section 7.2.12, Section 7.2.13, and Section 7.2.14. 1623 Citation may also be helpful in Section 7.2.4, and Section 7.2.5. 1625 7.2.11. IP and LDP Traffic 1627 A document is needed to define the use of measurements of native IP 1628 and native LDP traffic levels which are then used to reduce link 1629 advertised bandwidth amounts. 1631 The primary focus of this document, among the sets of requirements 1632 listed in Section 7.1 are the "load distribution, stability, minimal 1633 disruption" and the "admission control, preemption, traffic 1634 engineering" set of requirements. The "path determination, 1635 connectivity verification" must also be considered. The "backward 1636 compatibility and migration" and "general network management" 1637 requirements must also be considered. 1639 7.2.12. LDP Extensions 1641 Extending LDP is called for in DR#2. 
LDP can be extended to couple 1642 FEC admission control to local resource availability without 1643 providing LDP traffic engineering capability. Other LDP extensions 1644 such as signaling a bound on microflow size and LDP LSP requirements 1645 would provide useful information without providing LDP traffic 1646 engineering capability. 1648 The primary focus of this document, among the sets of requirements 1649 listed in Section 7.1, is the "admission control, preemption, traffic 1650 engineering" set of requirements. The "backward compatibility and 1651 migration" and "general network management" requirements must also be 1652 considered. 1654 7.2.13. Pseudowire Extensions 1656 Pseudowire (PW) extensions such as signaling a bound on microflow 1657 size and signaling requirements specific to PW would provide useful 1658 information. This information can be carried in the PW LDP signaling 1659 [RFC3985], and the PW requirements could then be used in a 1660 containing LSP. 1662 The primary focus of this document, among the sets of requirements 1663 listed in Section 7.1, is the "admission control, preemption, traffic 1664 engineering" set of requirements. The "backward compatibility and 1665 migration" and "general network management" requirements must also be 1666 considered. 1668 7.2.14. Multi-Domain Advanced Multipath 1670 DR#5 calls for Advanced Multipath to span multiple network 1671 topologies. Component LSP may already span multiple network 1672 topologies, though most often in practice these are LDP signaled. 1673 Component LSP which are RSVP-TE signaled may also span multiple 1674 network topologies using at least three existing methods (per domain 1675 [RFC5152], BRPC [RFC5441], PCE [RFC4655]). When such component links 1676 are combined in an Advanced Multipath, the Advanced Multipath spans 1677 multiple network topologies. It is not clear in which document this 1678 needs to be described or whether this description in the framework is 1679 sufficient.
The authors and/or the WG may need to discuss this. 1680 DR#5 mandates that IGP-TE extension cannot be used. This would 1681 disallow the use of [RFC5316] or [RFC5392] in conjunction with 1682 [RFC5151]. 1684 The primary focus of this document, among the sets of requirements 1685 listed in Section 7.1 are "single vs multiple domain" and "admission 1686 control, preemption, traffic engineering". The "routing information 1687 aggregation" and "load distribution, stability, minimal disruption" 1688 requirements need attention due to their use of the IGP in single 1689 domain Advanced Multipath. Other requirements such as "delay and 1690 delay variation", can more easily be accommodated by carrying metrics 1691 within BGP. The "path determination, connectivity verification" 1692 requirements need attention due to requirements to restrict 1693 disclosure of topology information across domains in multi-domain 1694 deployments. The "backward compatibility and migration" and "general 1695 network management" requirements must also be considered. 1697 7.3. Framework Requirement Coverage by Protocol 1699 As an aid to implementors, this section summarizes requirement 1700 coverage listed in Section 7.2 by protocol or LSR functionality 1701 affected. 1703 Some documentation may be purely informational, proposing no changes 1704 and proposing usage at most. This includes Section 7.2.3, 1705 Section 7.2.8, Section 7.2.10, and Section 7.2.14. 1707 Section 7.2.9 may require a new protocol. 1709 7.3.1. OSPF-TE and ISIS-TE Protocol Extensions 1711 Many of the changes listed in Section 7.2 require IGP-TE changes, 1712 though most are small extensions to provide additional information. 1713 This set includes Section 7.2.1, Section 7.2.2, Section 7.2.5, 1714 Section 7.2.6, and Section 7.2.7. An adjustment to existing 1715 advertised parameters is suggested in Section 7.2.11. 1717 7.3.2. 
PW Protocol Extensions 1719 The only suggestion of pseudowire (PW) extensions is in 1720 Section 7.2.13. 1722 7.3.3. LDP Protocol Extensions 1724 Potential LDP extensions are described in Section 7.2.12. 1726 7.3.4. RSVP-TE Protocol Extensions 1728 RSVP-TE protocol extensions are called for in Section 7.2.1, 1729 Section 7.2.5, Section 7.2.7, and Section 7.2.9. 1731 7.3.5. RSVP-TE Path Selection Changes 1733 Section 7.2.3 calls for path selection to be addressed in individual 1734 documents that require change. These changes would include those 1735 proposed in Section 7.2.1, Section 7.2.2, Section 7.2.5, and 1736 Section 7.2.7. 1738 7.3.6. RSVP-TE Admission Control and Preemption 1740 When a change is needed to path selection, a corresponding change is 1741 needed in admission control. The same set of sections applies: 1742 Section 7.2.1, Section 7.2.2, Section 7.2.5, and Section 7.2.7. Some 1743 resource changes, such as a link delay change, might trigger 1744 preemption. The rules of preemption remain unchanged, still based on 1745 holding priority. 1747 7.3.7. Flow Identification and Traffic Balance 1749 The following sections either describe the state of the art in flow 1750 identification and traffic balance or propose changes: Section 7.2.4, 1751 Section 7.2.5, Section 7.2.7, and Section 7.2.8. 1753 8. IANA Considerations 1755 This is a framework document and therefore does not specify protocol 1756 extensions. This memo includes no request to IANA. 1758 9. Security Considerations 1760 The security considerations for MPLS/GMPLS and for MPLS-TP are 1761 documented in [RFC5920] and [RFC6941]. 1763 The types of protocol extensions proposed in this framework document 1764 provide additional information about links, forwarding adjacencies, 1765 and LSP requirements. The protocol semantics changes described in 1766 this framework document propose additional LSP constraints applied at 1767 path computation time and at LSP admission at midpoint LSRs. The
The 1768 additional information and constraints raise no additional security 1769 considerations beyond the security considerations already documented 1770 in [RFC5920] and [RFC6941]. 1772 10. Acknowledgments 1774 Authors would like to thank Adrian Farrel, Fred Jounay, and Yuji Kamite 1775 for their extensive comments and suggestions regarding early versions 1776 of this document, and Ron Bonica, Nabil Bitar, Eric Gray, Lou Berger, and 1777 Kireeti Kompella for their reviews of early versions and great 1778 suggestions. 1780 Authors would like to thank Iftekhar Hussain for review and 1781 suggestions regarding recent versions of this document. 1783 In the interest of full disclosure of affiliation and in the interest 1784 of acknowledging sponsorship, past affiliations of authors are noted. 1785 Much of the work done by Ning So occurred while Ning was at Verizon. 1786 Much of the work done by Curtis Villamizar occurred while at 1787 Infinera. Infinera continues to sponsor this work on a consulting 1788 basis. 1790 11. References 1791 11.1. Normative References 1793 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 1794 Requirement Levels", BCP 14, RFC 2119, March 1997. 1796 [RFC3209] Awduche, D., Berger, L., Gan, D., Li, T., Srinivasan, V., 1797 and G. Swallow, "RSVP-TE: Extensions to RSVP for LSP 1798 Tunnels", RFC 3209, December 2001. 1800 [RFC3630] Katz, D., Kompella, K., and D. Yeung, "Traffic Engineering 1801 (TE) Extensions to OSPF Version 2", RFC 3630, 1802 September 2003. 1804 [RFC4201] Kompella, K., Rekhter, Y., and L. Berger, "Link Bundling 1805 in MPLS Traffic Engineering (TE)", RFC 4201, October 2005. 1807 [RFC4206] Kompella, K. and Y. Rekhter, "Label Switched Paths (LSP) 1808 Hierarchy with Generalized Multi-Protocol Label Switching 1809 (GMPLS) Traffic Engineering (TE)", RFC 4206, October 2005. 1811 [RFC5036] Andersson, L., Minei, I., and B. Thomas, "LDP 1812 Specification", RFC 5036, October 2007. 1814 [RFC5305] Li, T. and H.
Smit, "IS-IS Extensions for Traffic 1815 Engineering", RFC 5305, October 2008. 1817 [RFC5712] Meyer, M. and JP. Vasseur, "MPLS Traffic Engineering Soft 1818 Preemption", RFC 5712, January 2010. 1820 [RFC6107] Shiomoto, K. and A. Farrel, "Procedures for Dynamically 1821 Signaled Hierarchical Label Switched Paths", RFC 6107, 1822 February 2011. 1824 [RFC6374] Frost, D. and S. Bryant, "Packet Loss and Delay 1825 Measurement for MPLS Networks", RFC 6374, September 2011. 1827 [RFC6391] Bryant, S., Filsfils, C., Drafz, U., Kompella, V., Regan, 1828 J., and S. Amante, "Flow-Aware Transport of Pseudowires 1829 over an MPLS Packet Switched Network", RFC 6391, 1830 November 2011. 1832 11.2. Informative References 1834 [DBP] Bertsekas, D., "Dynamic Behavior of Shortest Path Routing 1835 Algorithms for Communication Networks", IEEE Trans. Auto. 1836 Control 1982. 1838 [I-D.atlas-mpls-te-express-path] 1839 Atlas, A., Drake, J., Giacalone, S., Ward, D., Previdi, 1840 S., and C. Filsfils, "Performance-based Path Selection for 1841 Explicitly Routed LSPs", 1842 draft-atlas-mpls-te-express-path-02 (work in progress), 1843 February 2013. 1845 [I-D.ietf-mpls-multipath-use] 1846 Villamizar, C., "Use of Multipath with MPLS-TP and MPLS", 1847 draft-ietf-mpls-multipath-use-00 (work in progress), 1848 February 2013. 1850 [I-D.ietf-ospf-te-metric-extensions] 1851 Giacalone, S., Ward, D., Drake, J., Atlas, A., and S. 1852 Previdi, "OSPF Traffic Engineering (TE) Metric 1853 Extensions", draft-ietf-ospf-te-metric-extensions-04 (work 1854 in progress), June 2013. 1856 [I-D.ietf-rtgwg-cl-requirement] 1857 Villamizar, C., McDysan, D., Ning, S., Malis, A., and L. 1858 Yong, "Requirements for Advanced Multipath in MPLS 1859 Networks", draft-ietf-rtgwg-cl-requirement-11 (work in 1860 progress), July 2013. 1862 [I-D.ietf-rtgwg-cl-use-cases] 1863 Ning, S., Malis, A., McDysan, D., Yong, L., and C. 
1864 Villamizar, "Advanced Multipath Use Cases and Design 1865 Considerations", draft-ietf-rtgwg-cl-use-cases-04 (work in 1866 progress), July 2013. 1868 [I-D.ospf-cc-stlv] 1869 Osborne, E., "Component and Composite Link Membership in 1870 OSPF", draft-ospf-cc-stlv-00 (work in progress), 1871 August 2011. 1873 [I-D.previdi-isis-te-metric-extensions] 1874 Previdi, S., Giacalone, S., Ward, D., Drake, J., Atlas, 1875 A., and C. Filsfils, "IS-IS Traffic Engineering (TE) 1876 Metric Extensions", 1877 draft-previdi-isis-te-metric-extensions-03 (work in 1878 progress), February 2013. 1880 [I-D.villamizar-mpls-multipath-extn] 1881 Villamizar, C., "Multipath Extensions for MPLS Traffic 1882 Engineering", draft-villamizar-mpls-multipath-extn-00 1883 (work in progress), November 2012. 1885 [RFC2475] Blake, S., Black, D., Carlson, M., Davies, E., Wang, Z., 1886 and W. Weiss, "An Architecture for Differentiated 1887 Services", RFC 2475, December 1998. 1889 [RFC2991] Thaler, D. and C. Hopps, "Multipath Issues in Unicast and 1890 Multicast Next-Hop Selection", RFC 2991, November 2000. 1892 [RFC2992] Hopps, C., "Analysis of an Equal-Cost Multi-Path 1893 Algorithm", RFC 2992, November 2000. 1895 [RFC3260] Grossman, D., "New Terminology and Clarifications for 1896 Diffserv", RFC 3260, April 2002. 1898 [RFC3468] Andersson, L. and G. Swallow, "The Multiprotocol Label 1899 Switching (MPLS) Working Group decision on MPLS signaling 1900 protocols", RFC 3468, February 2003. 1902 [RFC3945] Mannie, E., "Generalized Multi-Protocol Label Switching 1903 (GMPLS) Architecture", RFC 3945, October 2004. 1905 [RFC3985] Bryant, S. and P. Pate, "Pseudo Wire Emulation Edge-to- 1906 Edge (PWE3) Architecture", RFC 3985, March 2005. 1908 [RFC4385] Bryant, S., Swallow, G., Martini, L., and D. McPherson, 1909 "Pseudowire Emulation Edge-to-Edge (PWE3) Control Word for 1910 Use over an MPLS PSN", RFC 4385, February 2006. 1912 [RFC4448] Martini, L., Rosen, E., El-Aawar, N., and G.
Heron, 1913 "Encapsulation Methods for Transport of Ethernet over MPLS 1914 Networks", RFC 4448, April 2006. 1916 [RFC4655] Farrel, A., Vasseur, J., and J. Ash, "A Path Computation 1917 Element (PCE)-Based Architecture", RFC 4655, August 2006. 1919 [RFC4928] Swallow, G., Bryant, S., and L. Andersson, "Avoiding Equal 1920 Cost Multipath Treatment in MPLS Networks", BCP 128, 1921 RFC 4928, June 2007. 1923 [RFC5151] Farrel, A., Ayyangar, A., and JP. Vasseur, "Inter-Domain 1924 MPLS and GMPLS Traffic Engineering -- Resource Reservation 1925 Protocol-Traffic Engineering (RSVP-TE) Extensions", 1926 RFC 5151, February 2008. 1928 [RFC5152] Vasseur, JP., Ayyangar, A., and R. Zhang, "A Per-Domain 1929 Path Computation Method for Establishing Inter-Domain 1930 Traffic Engineering (TE) Label Switched Paths (LSPs)", 1931 RFC 5152, February 2008. 1933 [RFC5316] Chen, M., Zhang, R., and X. Duan, "ISIS Extensions in 1934 Support of Inter-Autonomous System (AS) MPLS and GMPLS 1935 Traffic Engineering", RFC 5316, December 2008. 1937 [RFC5392] Chen, M., Zhang, R., and X. Duan, "OSPF Extensions in 1938 Support of Inter-Autonomous System (AS) MPLS and GMPLS 1939 Traffic Engineering", RFC 5392, January 2009. 1941 [RFC5441] Vasseur, JP., Zhang, R., Bitar, N., and JL. Le Roux, "A 1942 Backward-Recursive PCE-Based Computation (BRPC) Procedure 1943 to Compute Shortest Constrained Inter-Domain Traffic 1944 Engineering Label Switched Paths", RFC 5441, April 2009. 1946 [RFC5920] Fang, L., "Security Framework for MPLS and GMPLS 1947 Networks", RFC 5920, July 2010. 1949 [RFC5921] Bocci, M., Bryant, S., Frost, D., Levrau, L., and L. 1950 Berger, "A Framework for MPLS in Transport Networks", 1951 RFC 5921, July 2010. 1953 [RFC6790] Kompella, K., Drake, J., Amante, S., Henderickx, W., and 1954 L. Yong, "The Use of Entropy Labels in MPLS Forwarding", 1955 RFC 6790, November 2012. 1957 [RFC6941] Fang, L., Niven-Jenkins, B., Mansfield, S., and R. 
1958 Graveman, "MPLS Transport Profile (MPLS-TP) Security 1959 Framework", RFC 6941, April 2013. 1961 Authors' Addresses 1963 So Ning 1964 Tata Communications 1966 Email: ning.so@tatacommunications.com 1968 Dave McDysan 1969 Verizon 1970 22001 Loudoun County PKWY 1971 Ashburn, VA 20147 1972 USA 1974 Email: dave.mcdysan@verizon.com 1975 Eric Osborne 1976 Cisco 1978 Email: eosborne@cisco.com 1980 Lucy Yong 1981 Huawei USA 1982 5340 Legacy Dr. 1983 Plano, TX 75025 1984 USA 1986 Phone: +1 469-277-5837 1987 Email: lucy.yong@huawei.com 1989 Curtis Villamizar 1990 Outer Cape Cod Network Consulting 1992 Email: curtis@occnc.com