RTGWG                                                            S. Ning
Internet-Draft                                       Tata Communications
Intended status: Informational                                D. McDysan
Expires: December 17, 2013                                       Verizon
                                                              E. Osborne
                                                                   Cisco
                                                                 L. Yong
                                                              Huawei USA
                                                           C. Villamizar
                                                  Outer Cape Cod Network
                                                              Consulting
                                                           June 15, 2013

   Composite Link Framework in Multi Protocol Label Switching (MPLS)
                    draft-ietf-rtgwg-cl-framework-03

Abstract

   This document specifies a framework for support of composite link in
   MPLS networks. A composite link consists of a group of homogenous or
   non-homogenous links that have the same forward adjacency and can be
   considered as a single TE link or an IP link in routing. A composite
   link relies on its component links to carry the traffic over the
   composite link. Applicability is described for a single pair of
   MPLS-capable nodes, a sequence of MPLS-capable nodes, or a set of
   layer networks connecting MPLS-capable nodes.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on December 17, 2013.

Copyright Notice

   Copyright (c) 2013 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.
   Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1. Introduction
     1.1. Background
     1.2. Architecture Summary
     1.3. Conventions used in this document
     1.4. Terminology
     1.5. Document Issues
   2. Composite Link Key Characteristics
     2.1. Flow Identification
     2.2. Composite Link in Control Plane
     2.3. Composite Link in Data Plane
   3. Architecture Tradeoffs
     3.1. Scalability Motivations
     3.2. Reducing Routing Information and Exchange
     3.3. Reducing Signaling Load
       3.3.1. Reducing Signaling Load using LDP
       3.3.2. Reducing Signaling Load using Hierarchy
       3.3.3. Using Both LDP and RSVP-TE Hierarchy
     3.4. Reducing Forwarding State
     3.5. Avoiding Route Oscillation
   4. New Challenges
     4.1. Control Plane Challenges
       4.1.1. Delay and Jitter Sensitive Routing
       4.1.2. Local Control of Traffic Distribution
       4.1.3. Path Symmetry Requirements
       4.1.4. Requirements for Contained LSP
       4.1.5. Retaining Backwards Compatibility
     4.2. Data Plane Challenges
       4.2.1. Very Large LSP
       4.2.2. Very Large Microflows
       4.2.3. Traffic Ordering Constraints
       4.2.4. Accounting for IP and LDP Traffic
       4.2.5. IP and LDP Limitations
   5. Existing Mechanisms
     5.1. Link Bundling
     5.2. Classic Multipath
   6. Mechanisms Proposed in Other Documents
     6.1. Loss and Delay Measurement
     6.2. Link Bundle Extensions
     6.3. Pseudowire Flow and MPLS Entropy Labels
     6.4. Multipath Extensions
   7. Required Protocol Extensions and Mechanisms
     7.1. Brief Review of Requirements
     7.2. Proposed Document Coverage
       7.2.1. Component Link Grouping
       7.2.2. Delay and Jitter Extensions
       7.2.3. Path Selection and Admission Control
       7.2.4. Dynamic Multipath Balance
       7.2.5. Frequency of Load Balance
       7.2.6. Inter-Layer Communication
       7.2.7. Packet Ordering Requirements
       7.2.8. Minimally Disruption Load Balance
       7.2.9. Path Symmetry
       7.2.10. Performance, Scalability, and Stability
       7.2.11. IP and LDP Traffic
       7.2.12. LDP Extensions
       7.2.13. Pseudowire Extensions
       7.2.14. Multi-Domain Composite Link
     7.3. Framework Requirement Coverage by Protocol
       7.3.1. OSPF-TE and ISIS-TE Protocol Extensions
       7.3.2. PW Protocol Extensions
       7.3.3. LDP Protocol Extensions
       7.3.4. RSVP-TE Protocol Extensions
       7.3.5. RSVP-TE Path Selection Changes
       7.3.6. RSVP-TE Admission Control and Preemption
       7.3.7. Flow Identification and Traffic Balance
   8. IANA Considerations
   9. Security Considerations
   10. Acknowledgments
   11. References
     11.1. Normative References
     11.2. Informative References
   Authors' Addresses

1. Introduction

   Composite Link functional requirements are specified in
   [I-D.ietf-rtgwg-cl-requirement]. Composite Link use cases are
   described in [I-D.ietf-rtgwg-cl-use-cases]. This document specifies
   a framework to meet these requirements.

   This document describes a composite link framework in the context of
   MPLS networks using an IGP-TE and RSVP-TE MPLS control plane with
   GMPLS extensions [RFC3209] [RFC3630] [RFC3945] [RFC5305].

   Specific protocol solutions are outside the scope of this document;
   however, a framework for the extension of existing protocols is
   provided.
   Backwards compatibility is best achieved by extending
   existing protocols where practical rather than inventing new
   protocols. The focus is on examining where existing protocol
   mechanisms fall short with respect to [I-D.ietf-rtgwg-cl-requirement]
   and on the types of extensions that will be required to accommodate
   functionality that is called for in [I-D.ietf-rtgwg-cl-requirement].

1.1. Background

   Classic multipath, including Ethernet Link Aggregation, has been
   widely used in today's MPLS networks [RFC4385][RFC4928]. Classic
   multipath using non-Ethernet links is often advertised using MPLS
   link bundling. A link bundle [RFC4201] bundles a group of
   homogeneous links as a TE link to make IGP-TE information exchange
   and RSVP-TE signaling more scalable. A composite link allows
   bundling non-homogenous links together as a single logical link. The
   motivations for using a composite link are described in
   [I-D.ietf-rtgwg-cl-requirement] and [I-D.ietf-rtgwg-cl-use-cases].

   A composite link is a single logical link in an MPLS network that
   contains multiple parallel component links between two MPLS LSR.
   Unlike a link bundle [RFC4201], the component links in a composite
   link can have different properties such as cost, capacity, delay, or
   jitter.

1.2. Architecture Summary

   Networks aggregate information, both in the control plane and in the
   data plane, as a means to achieve scalability. A tradeoff exists
   between the needs of scalability and the needs to identify differing
   path and link characteristics and differing requirements among flows
   contained within further aggregated traffic flows. These tradeoffs
   are discussed in detail in Section 3.

   Some aspects of Composite Link requirements present challenges for
   which multiple solutions may exist. In Section 4 various challenges
   and potential approaches are discussed.

   A subset of the functionality called for in
   [I-D.ietf-rtgwg-cl-requirement] is available through MPLS Link
   Bundling [RFC4201]. Link bundling and other existing standards
   applicable to Composite Link are covered in Section 5.

   The most straightforward means of supporting Composite Link
   requirements is to extend MPLS protocols and protocol semantics and
   in particular to extend link bundling. Extensions which have already
   been proposed in other documents which are applicable to Composite
   Link are discussed in Section 6.

   A goal of most new protocol work within IETF is to reuse existing
   protocol encapsulations and mechanisms where they meet requirements
   and extend existing mechanisms. This approach minimizes additional
   complexity while meeting requirements and tends to preserve backwards
   compatibility to the extent it is practical to do so. These goals
   are considered in proposing a framework for further protocol
   extensions and mechanisms in Section 7.

1.3. Conventions used in this document

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

1.4. Terminology

   Terminology defined in [I-D.ietf-rtgwg-cl-requirement] is used in
   this document.

   The abbreviation IGP-TE is used as a shorthand indicating either
   OSPF-TE [RFC3630] or ISIS-TE [RFC5305].

1.5. Document Issues

   This subsection exists solely for the purpose of focusing the RTGWG
   meeting and mailing list discussions on areas within this document
   that need attention in order for the document to achieve the level of
   quality necessary to advance the document through the IETF process.
   This subsection will be removed before work group last call.

   The following issues need to be resolved.

   1.
       The feasibility of symmetric paths for all flows is questionable.
       The only case where this is practical is where LSP are smaller
       than component links and where classic link bundling (not using
       the all-ones component) is used. Perhaps the emphasis on this
       (mis)feature should be reduced in the requirements document. See
       Section 4.1.3.

   2.  There is a tradeoff between supporting delay optimized routing
       and avoiding oscillation. This may be sufficiently covered, but
       a careful review by others and comments would be beneficial.

   3.  Any measurement of jitter (delay variation) that is used in route
       decision is likely to cause oscillation. Trying to optimize a
       path to reduce jitter may be a fool's errand. How do we say this
       in the draft or does the existing text cover it adequately?

   4.  RTGWG needs to consider the possibility of using multi-topology
       IGP extensions in IP and LDP routing where the topologies reflect
       differing requirements (see Section 4.2.5). This idea is similar
       to TOS routing, which has been discussed for decades but has
       never been deployed. One possible outcome of discussion would be
       to declare TOS routing out of scope.

   5.  The following referenced drafts have expired:

       A.  [I-D.ospf-cc-stlv]

       B.  [I-D.villamizar-mpls-multipath-extn]

       A replacement for [I-D.ospf-cc-stlv] is expected to be submitted.
       [I-D.villamizar-mpls-multipath-extn] is expected to emerge in a
       simplified form, removing extensions for which existing
       workarounds are considered adequate based on feedback at a prior
       IETF.

   6.  Clarification of what we intend to do with Multi-Domain Composite
       Link is needed in Section 7.2.14.

   7.  The following topics in the requirements document are not
       addressed. Since they are explicitly mentioned in the
       requirements document some mention of how they are supported is
       needed, even if to say nothing needs to be done.
       If we conclude
       any particular topic is irrelevant, maybe the topic should be
       removed from the requirement document. At that point we could
       add the management requirements that have come up and were
       missed.

       A.  L3VPN RFC 4364, RFC 4797, L2VPN RFC 4664, VPWS, VPLS RFC
           4761, RFC 4762 and VPMS VPMS Framework
           (draft-ietf-l2vpn-vpms-frmwk-requirements) are referenced in
           the requirements document "Assumptions" section. It is not
           clear what additional Composite Link requirements these
           references imply, if any. If no additional requirements are
           implied, then these references are considered to be
           informational only.

       B.  Migration (incremental deployment) may not be adequately
           covered in Section 4.1.5. It might also be necessary to say
           more here on performance, scalability, and stability as it
           relates to migration. Comments on this from co-authors or
           the WG?

       C.  We may need a performance section in this document to
           specifically address DR#6 (fast convergence), and DR#7 (fast
           worst case failure convergence). We do already have
           scalability discussion and make a recommendation for a
           separate document. At the very least the performance section
           would have to say "no worse than before, except where there
           was no alternative to make it very slightly worse" (in a bit
           more detail than that). It might also be helpful to better
           define the nature of the performance criteria implied by DR#6
           and DR#7.

   The above list of issues is to be discussed at the upcoming IETF-85
   meeting. Hopefully some of the issues can be resolved at the meeting
   or on the RTGWG mailing list.

2. Composite Link Key Characteristics

   [I-D.ietf-rtgwg-cl-requirement] defines external behavior of
   Composite Links.
   The overall framework approach involves extending
   existing protocols in a backwards compatible manner and reusing
   ongoing work elsewhere in IETF where applicable, defining new
   protocols or semantics only where necessary. Given the requirements,
   and this approach of extending MPLS, Composite Link key
   characteristics can be described in greater detail than given
   requirements alone.

2.1. Flow Identification

   Traffic mapping to component links is a data plane operation.
   Control over how the mapping is done may be directly dictated or
   constrained by the control plane or by the management plane. When
   unconstrained by the control plane or management plane, distribution
   of traffic is entirely a local matter. Regardless of constraints or
   lack of constraints, the traffic distribution is required to keep
   packets belonging to individual flows in sequence and meet QoS
   criteria specified per LSP by either signaling or management
   [RFC2475] [RFC3260].

   Key objectives of the traffic distribution are to not overload any
   component link, and to be able to perform local recovery when a
   subset of component links fails.

   The network operator may have other objectives such as placing a
   bidirectional flow or LSP on the same component link in both
   directions, bounding delay and/or jitter, composite link energy
   saving, etc. These new requirements are described in
   [I-D.ietf-rtgwg-cl-requirement].

   Examples of means to identify a flow may in principle include:

   1.  an LSP identified by an MPLS label,

   2.  a sub-LSP [I-D.kompella-mpls-rsvp-ecmp] identified by an MPLS
       label,

   3.  a pseudowire (PW) [RFC3985] identified by an MPLS PW label,

   4.  a flow or group of flows within a pseudowire (PW) [RFC6391]
       identified by an MPLS flow label,

   5.  a flow or flow group in an LSP [RFC6790] identified by an MPLS
       entropy label,

   6.
       all traffic between a pair of IP hosts, identified by an IP
       source and destination pair,

   7.  a specific connection between a pair of IP hosts, identified by
       an IP source and destination pair, protocol, and protocol port
       pair,

   8.  a layer-2 conversation within a pseudowire (PW), where the
       identification is PW payload type specific, such as Ethernet MAC
       addresses and VLAN tags within an Ethernet PW [RFC4448]. This is
       feasible but not practical (see below).

   Although in principle a layer-2 conversation within a pseudowire
   (PW) may be identified by PW payload type specific information, this
   is impractical at LSP midpoints when PW are carried.
   The PW ingress may provide equivalent information in a PW flow label
   [RFC6391]. Therefore, in practice, item #8 above is covered by
   [RFC6391] and may be dropped from the list.

   An LSR must at least be capable of identifying flows based on MPLS
   labels. Most MPLS LSP do not require that traffic carried by the LSP
   is carried in order. MPLS-TP is a recent exception. If it is
   assumed that no LSP require strict packet ordering of the LSP itself
   (only of flows within the LSP), then the entire label stack can be
   used as flow identification. If some LSP may require strict packet
   ordering but those LSP cannot be distinguished from others, then only
   the top label can be used as a flow identifier. If only the top
   label is used (for example, as specified by [RFC4201] when the
   "all-ones" component described in [RFC4201] is not used), then there
   may not be adequate flow granularity to accomplish well balanced
   traffic distribution and it will not be possible to carry LSP that
   are larger than any individual component link.

   The number of flows can be extremely large. This may be the case
   when the entire label stack is used and is always the case when IP
   addresses are used in provider networks carrying Internet traffic.
   Current practices for native IP load balancing at the time of writing
   were documented in [RFC2991] and [RFC2992]. These practices, as
   described, make use of IP addresses.

   The common practices described in [RFC2991] and [RFC2992] were
   extended to include the MPLS label stack and the common practice of
   looking at IP addresses within the MPLS payload. These extended
   practices require that pseudowires use a PWE3 Control Word and are
   described in [RFC4385] and [RFC4928]. Additional detail on current
   multipath practices can be found in the appendices of
   [I-D.ietf-rtgwg-cl-use-cases].

   Using only the top label supports too coarse a traffic balance.
   Prior to MPLS Entropy Label [RFC6790] using the full label stack was
   also too coarse. Using the full label stack and IP addresses as flow
   identification provides a sufficiently fine traffic balance, but can
   identify such a high number of distinct flows that a technique of
   grouping flows, such as hashing on the flow identification criteria,
   becomes essential to reduce the stored state and to scale. Other
   means of grouping flows may be possible.

   In summary:

   1.  Load balancing using only the MPLS label stack provides too
       coarse a granularity of load balance.

   2.  Tracking every flow is not scalable due to the extremely large
       number of flows in provider networks.

   3.  Existing techniques, IP source and destination hash in
       particular, have proven in over two decades of experience to be
       an excellent way of identifying groups of flows.

   4.  If a better way to identify groups of flows is discovered, then
       that method can be used.

   5.  IP address hashing is not required, but use of this technique is
       strongly encouraged given the technique's long history of
       successful deployment.
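
   As an illustration of the flow grouping summarized above, the
   following minimal sketch hashes an IP source and destination pair
   into a small, fixed number of flow groups, so that state is kept per
   group rather than per flow. The function name flow_group, the use of
   CRC-32, and the default of 8 bits of group index are assumptions made
   for this example only; they are not taken from this document or from
   any of the cited RFCs.

```python
import ipaddress
import zlib


def flow_group(src: str, dst: str, group_bits: int = 8) -> int:
    """Map an IP source/destination pair to one of 2**group_bits groups.

    CRC-32 stands in for whatever hash an implementation really uses
    (an assumption made for illustration only).
    """
    # Concatenate the packed (binary) addresses to form the hash key.
    key = ipaddress.ip_address(src).packed + ipaddress.ip_address(dst).packed
    # Keep only the low-order bits so the number of groups stays bounded.
    return zlib.crc32(key) & ((1 << group_bits) - 1)


if __name__ == "__main__":
    # The same address pair always yields the same group, so packets of
    # one flow stay together and therefore stay in sequence.
    print(flow_group("192.0.2.1", "198.51.100.7"))
    print(flow_group("192.0.2.1", "198.51.100.7"))
    print(flow_group("2001:db8::1", "2001:db8::2", group_bits=4))
```

   Because only the group index is stored, the amount of state depends
   on group_bits and not on the number of distinct flows observed.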

   MPLS Entropy Label [RFC6790] provides a means of extracting the
   entropy from information that would require deeper packet inspection,
   such as inspection of IP addresses, and putting that entropy in the
   form of a hashed value into the label stack. Midpoint LSR that
   understand the Entropy Label Indicator can make use of only label
   stack information but still obtain a fine load balance granularity.

2.2. Composite Link in Control Plane

   A composite link is advertised as a single logical interface between
   two connected routers, which forms a forwarding adjacency (FA)
   between the routers. The FA is advertised as a TE-link in a link
   state IGP, using either OSPF-TE or ISIS-TE. The IGP-TE advertised
   interface parameters for the composite link can be preconfigured by
   the network operator or be derived from its component links.
   Composite link advertisement requirements are specified in
   [I-D.ietf-rtgwg-cl-requirement].

   In IGP-TE, a composite link is advertised as a single TE link between
   two connected routers. This is similar to a link bundle [RFC4201].
   Link bundle applies to a set of homogenous component links.
   Composite link allows homogenous and non-homogenous component links.
   Due to the similarity, and for backwards compatibility, extending
   link bundling is viewed as both simple and as the best approach.

   In order for a route computation engine to calculate a proper path
   for an LSP, it is necessary for a composite link to advertise the
   summarized available bandwidth as well as the maximum bandwidth that
   can be made available for a single flow (or single LSP where no finer
   flow identification is available). If a composite link contains some
   non-homogeneous component links, the composite link also should
   advertise the summarized bandwidth and the maximum bandwidth for a
   single flow for each homogeneous component link group.
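
   The two advertised quantities described above can be sketched as
   follows. This is a minimal illustration, not a protocol encoding.
   The ComponentLink type, the group labels, and the units are
   hypothetical, and the sketch assumes that the maximum bandwidth for
   a single flow is simply the largest available bandwidth on any one
   component link, ignoring capacity that could be freed by moving
   other traffic.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class ComponentLink:
    name: str
    group: str             # label of a homogeneous group (hypothetical)
    available_mbps: float  # bandwidth not yet reserved on this link


def summarize(links):
    """Return (composite, per_group) bandwidth summaries.

    Each summary is a tuple (summarized_available, max_single_flow).
    Requires at least one component link.
    """
    groups = defaultdict(list)
    for link in links:
        groups[link.group].append(link.available_mbps)
    # Per-group values: total available, and largest single component.
    per_group = {g: (sum(bw), max(bw)) for g, bw in groups.items()}
    all_bw = [link.available_mbps for link in links]
    composite = (sum(all_bw), max(all_bw))
    return composite, per_group


if __name__ == "__main__":
    links = [
        ComponentLink("a", "low-delay", 4000.0),
        ComponentLink("b", "low-delay", 7000.0),
        ComponentLink("c", "high-delay", 30000.0),
    ]
    composite, per_group = summarize(links)
    print(composite)
    print(per_group)
```

   A path computation that needs a low-delay group would compare the
   LSP bandwidth against the "low-delay" entry rather than against the
   composite totals.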

   Both LDP [RFC5036] and RSVP-TE [RFC3209] can be used to signal an LSP
   over a composite link. LDP cannot be extended to support traffic
   engineering capabilities [RFC3468].

   When an LSP is signaled using RSVP-TE, the LSP MUST be placed on the
   component link that meets the LSP criteria indicated in the signaling
   message.

   When an LSP is signaled using LDP, the LSP MUST be placed on the
   component link that meets the LSP criteria, if such a component link
   is available. LDP does not support traffic engineering capabilities,
   imposing restrictions on LDP use of Composite Link. See
   Section 4.2.5 for further details.

   If the composite link solution is based on extensions to IGP-TE and
   RSVP-TE, then in order to meet requirements defined in
   [I-D.ietf-rtgwg-cl-requirement], the following derived requirements
   MUST be met.

   1.  A composite link MAY contain non-homogeneous component links.
       The route computing engine MAY select one group of component
       links for an LSP. The route computing engine MUST accommodate
       service objectives for a given LSP when selecting a group of
       component links for an LSP.

   2.  The routing protocol MUST make a grouping of component links
       available in the TE-LSDB, such that within each group all of the
       component links have similar characteristics (the component links
       are homogeneous within a group).

   3.  The route computation used in RSVP-TE MUST be extended to include
       only the capacity of groups within a composite link which meet
       LSP criteria.

   4.  The signaling protocol MUST be able to indicate either the
       criteria, or which groups may be used.

   5.  A composite link MUST place each LSP on a component link or group
       which meets or exceeds the LSP criteria.

   Composite link capacity is aggregated capacity. LSP capacity MAY be
   larger than individual component link capacity.
   Any aggregated LSP
   can determine a bound on the largest microflow that could be carried
   and this constraint can be handled as follows.

   1.  If no information is available through signaling, management
       plane, or configuration, the largest microflow is bounded by one
       of the following:

       A.  the largest single LSP if most traffic is RSVP-TE signaled
           and further aggregated,

       B.  the largest pseudowire if most traffic is carrying pseudowire
           payloads that are aggregated within RSVP-TE LSP,

       C.  or the largest interface or component link capacity carrying
           IP or LDP if a large amount of IP or LDP traffic is contained
           within the aggregate.

       If a very large amount of traffic being aggregated is IP or LDP,
       then the largest microflow is bounded by the largest component
       link on which IP traffic can arrive. For example, if an LSR is
       acting as an LER and IP and LDP traffic is arriving on 10 Gb/s
       edge interfaces, then no microflow larger than 10 Gb/s will be
       present on the RSVP-TE LSP that aggregate traffic across the
       core, even if the core interfaces are 100 Gb/s interfaces.

   2.  The prior conditions provide a bound on the largest microflow
       when no signaling extensions indicate a bound. If an LSP is
       aggregating smaller LSP for which the largest expected microflow
       carried by the smaller LSP is signaled, then the largest
       microflow expected in the containing LSP (the aggregate) is the
       maximum of the largest expected microflow for any contained LSP.
       For example, RSVP-TE LSP may be large but aggregate traffic for
       which the source or sink are all 1 Gb/s or smaller interfaces
       (such as in mobile applications in which cell site backhauls are
       no larger than 1 Gb/s). If this information is carried in the
       LSP originated at the cell sites, then further aggregates across
       a core may make use of this information.

   3.
       The IGP must provide the bound on the largest microflow that a
       composite link can accommodate, which is the maximum capacity on
       a component link that can be made available by moving other
       traffic. This information is needed by the ingress LER for path
       determination.

   4.  A means is needed to signal an LSP whose capacity is larger than
       individual component link capacity
       [I-D.ietf-rtgwg-cl-requirement] and also to signal the largest
       microflow expected to be contained in the LSP. If a bound on the
       largest microflow is not signaled, there is no means to determine
       if an LSP which is larger than any component link can be
       subdivided into flows and therefore should be accepted by
       admission control.

   When a bidirectional LSP request is signaled over a composite link,
   if the request indicates that the LSP must be placed on the same
   component link, the routers of the composite link MUST place the LSP
   traffic in both directions on the same component link. This is
   particularly challenging for aggregated capacity which makes use of
   the label stack for traffic distribution. The two requirements are
   mutually exclusive for any one LSP. No one LSP may be both larger
   than any individual component link and require symmetrical paths for
   every flow. Both requirements can be accommodated by the same
   composite link for different LSP, with any one LSP requiring no more
   than one of these two features.

   Individual component links may fail independently. Upon component
   link failure, a composite link MUST support a minimally disruptive
   local repair, preempting any LSP which can no longer be supported.
   Available capacity in other component links MUST be used to carry
   impacted traffic. The available bandwidth after failure MUST be
   advertised immediately to avoid looped crankback.
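
   The local repair behavior described above can be sketched as
   follows, under simplifying assumptions. The Lsp type, the
   local_repair function, and the best-fit choice of target link are
   hypothetical and are not specified by this document. The sketch
   moves only LSP that were on the failed component link, considers
   them in order of RSVP-TE holding priority (0 being the highest
   priority), and preempts an impacted LSP when no surviving component
   link has room for it. It does not preempt lower priority LSP on
   surviving links to make room, which a real implementation might do.

```python
from dataclasses import dataclass


@dataclass
class Lsp:
    name: str
    mbps: float
    holding_priority: int  # RSVP-TE convention: 0 is the highest priority
    component: str         # component link currently carrying the LSP


def local_repair(lsps, free_mbps, failed):
    """Re-place LSP from a failed component link.

    free_mbps maps each component link to its unused capacity and is
    updated in place. Returns (moved, preempted) lists of LSP names.
    LSP on surviving links are left untouched.
    """
    free_mbps.pop(failed, None)  # the failed link offers no capacity
    moved, preempted = [], []
    impacted = sorted(
        (lsp for lsp in lsps if lsp.component == failed),
        key=lambda lsp: (lsp.holding_priority, -lsp.mbps),
    )
    for lsp in impacted:
        fits = [c for c, free in free_mbps.items() if free >= lsp.mbps]
        if not fits:
            preempted.append(lsp.name)  # can no longer be supported
            continue
        # Best fit: the surviving link with the least spare capacity.
        target = min(fits, key=lambda c: free_mbps[c])
        free_mbps[target] -= lsp.mbps
        lsp.component = target
        moved.append(lsp.name)
    return moved, preempted


if __name__ == "__main__":
    lsps = [Lsp("A", 6.0, 0, "c1"), Lsp("B", 5.0, 3, "c1"),
            Lsp("C", 2.0, 7, "c1"), Lsp("D", 1.0, 7, "c2")]
    free = {"c1": 0.0, "c2": 7.0, "c3": 3.0}
    print(local_repair(lsps, free, "c1"))
    # Remaining capacity, i.e. what would be re-advertised after repair.
    print(sum(free.values()))
```

   The sum of the remaining entries in free_mbps corresponds to the
   available bandwidth that would be advertised after the failure.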

   When a composite link is not able to transport all flows, it preempts
   some flows based upon holding priority and informs the control plane
   of these preempted flows. To minimize impact on traffic, the
   composite link MUST support soft preemption [RFC5712]. The network
   operator SHOULD enable soft preemption. This action ensures the
   remaining traffic is transported properly. FR#10 requires that the
   traffic be restored. FR#12 requires that any change be minimally
   disruptive. These two requirements are interpreted to include
   preemption among the types of changes that must be minimally
   disruptive.

2.3. Composite Link in Data Plane

   The data plane must identify groups of flows. Flow identification is
   covered in Section 2.1. Having identified groups of flows, the
   groups must be placed on individual component links. This step
   following flow group identification is called traffic distribution
   or traffic placement. The two steps together are known as traffic
   balancing or load balancing.

   Traffic distribution may be determined by or constrained by the
   control plane or management plane. Traffic distribution may be
   changed due to component link status change, subject to constraints
   imposed by either the management plane or control plane. The
   distribution function is local to the routers to which a composite
   link belongs, and its implementation is not specified here.

   When performing traffic placement, a composite link does not
   differentiate multicast traffic vs. unicast traffic.

   In order to maintain scalability, existing data plane forwarding
   retains state associated with the top label only. Using UHP (UHP is
   the absence of the more common PHP), zero or more labels may be POPed
   and packet and byte counters incremented prior to processing what
   becomes the top label after the POP operations are completed.
Flow group identification may be a parallel step in the forwarding process. Data plane forwarding makes use of the top label to select a composite link, or a group of components within a composite link, or, for the case where an LSP is pinned (see [RFC4201]), a specific component link. For those LSP for which the top label selects only the composite link or a group of components within a composite link, the load balancing makes use of the set of component links selected based on the top label, and makes use of the flow group identification to select among that group.

The simplest traffic placement technique uses a modulo operation after computing a hash. This technique has significant disadvantages. The most common traffic placement technique uses the flow group identification as an index into a table. The table provides an indirection. The number of bits of hash is constrained to keep table size small. While this is not the best technique, it is the most common. Better techniques exist but they are outside the scope of this document and some are considered proprietary.

Requirements to limit frequency of load balancing can be adhered to by keeping track of when a flow group was last moved and imposing a minimum period before that flow group can be moved again. This is straightforward for a table approach. For other approaches it may be less straightforward.

3.  Architecture Tradeoffs

Scalability and stability are critical considerations in protocol design where protocols may be used in a large network such as today's service provider networks. Composite Link is applicable to networks which are large enough to require that traffic be split over multiple paths. Scalability is a major consideration for networks that reach a capacity large enough to require Composite Link.
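As a concrete illustration of the table-based traffic placement described in Section 2.3, the following minimal sketch indexes a small table with a truncated hash and refuses to move a flow group again until a minimum period has elapsed. It is not drawn from any particular implementation: the class name PlacementTable, the choice of CRC-32 as the hash, the default table size, and the time units are all assumptions made for illustration only.

```python
import zlib
from typing import List


class PlacementTable:
    """Hypothetical table-based traffic placement for one composite link."""

    def __init__(self, components: List[str], hash_bits: int = 8,
                 min_move_interval: float = 10.0) -> None:
        # Constraining the number of hash bits keeps the table small.
        self.size = 1 << hash_bits
        # Initial placement: spread flow groups evenly over the components.
        self.table = [components[i % len(components)]
                      for i in range(self.size)]
        # Time at which each flow group was last moved (never, initially).
        self.last_moved = [float("-inf")] * self.size
        self.min_move_interval = min_move_interval

    def flow_group(self, flow_key: bytes) -> int:
        """Flow group identification: hash truncated to the table size."""
        return zlib.crc32(flow_key) & (self.size - 1)

    def component_for(self, flow_key: bytes) -> str:
        """Traffic placement: the table provides the indirection."""
        return self.table[self.flow_group(flow_key)]

    def move(self, group: int, new_component: str, now: float) -> bool:
        """Move a flow group unless it was moved too recently."""
        if now - self.last_moved[group] < self.min_move_interval:
            return False
        self.table[group] = new_component
        self.last_moved[group] = now
        return True
```

Because the indirection is a plain table, recording one timestamp per entry is enough to enforce the minimum period, which is why limiting the frequency of load balancing is straightforward for a table approach.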

Some of the requirements of Composite Link could potentially have a negative impact on scalability. This section is about architectural tradeoffs, many motivated by the need to maintain scalability and stability, a need which is reflected in [I-D.ietf-rtgwg-cl-requirement], specifically in DR#6 and DR#7.

3.1.  Scalability Motivations

In the interest of scalability, information is aggregated in situations where information about a large amount of network capacity or a large amount of network demand is adequate to meet requirements. Routing information is aggregated to reduce the amount of information exchange related to routing and to simplify route computation (see Section 3.2).

In an MPLS network large routing changes can occur when a single fault occurs. For example, a single fault may impact a very large number of LSP traversing a given link. As new LSP are signaled to avoid the fault, resources are consumed elsewhere, and routing protocol announcements must flood the resource changes. If protection is in place, there is less urgency to converge quickly. If multiple faults occur that are not covered by shared risk groups (SRG), then some protection may fail, adding urgency to converge quickly even where protection is deployed.

Reducing the amount of information allows the exchange of information during a large routing change to be accomplished more quickly and simplifies route computation. Simplifying route computation improves convergence time after very significant network faults which cannot be handled by preprovisioned or precomputed protection mechanisms. Aggregating smaller LSP into larger LSP is a means to reduce path computation load and reduce RSVP-TE signaling (see Section 3.3).

Neglecting scaling issues can result in performance issues, such as slow convergence.
Neglecting scaling in some cases can result in networks which perform so poorly as to become unstable.

3.2.  Reducing Routing Information and Exchange

Link bundling provides a means of aggregating control plane information. Even where the all-ones component link supported by link bundling is not used, the amount of control information is reduced by the number of component links in a bundle.

Fully deaggregating link bundle information would negate this benefit. If there is a need to deaggregate, such as to distinguish between groups of links within specified ranges of delay, then no more deaggregation than is necessary should be done.

For example, in supporting the requirement for heterogeneous component links, it makes little sense to fully deaggregate link bundles when adding support for groups of component links with common attributes within a link bundle can maintain most of the benefit of aggregation while adequately supporting the requirement to support heterogeneous component links.

Routing information exchange is also reduced by making sensible choices regarding the amount of change to link parameters that require link readvertisement. For example, if delay measurements include queuing delay, then a much more coarse granularity of delay measurement would be called for than if the delay does not include queuing and is dominated by geographic delay (speed of light delay).

3.3.  Reducing Signaling Load

Aggregating traffic into very large hierarchical LSP in the core very substantially reduces the number of LSP that need to be signaled and the number of path computations any given LSR will be required to perform when a network fault occurs.

In the extreme, applying MPLS to a very large network without hierarchy could exceed the 20 bit label space.
For example, a network with 4,000 nodes, with 2,000 on either side of a cutset, would have 4,000,000 LSP crossing the cutset (2,000 x 2,000), well in excess of the 1,048,576 values available in a 20 bit label space. Even in a degree four cutset, with on the order of 1,000,000 LSP per link, an uneven distribution of LSP across the cutset, or the loss of one link, would result in a need to exceed the size of the label space. Among provider networks, 4,000 access nodes is not at all large. Hierarchy would be an absolute requirement if all access nodes were interconnected in such a network.

In less extreme cases, having each node terminate hundreds of LSP to achieve a full mesh creates a very large computational load. Computational complexity is a function of the number of nodes (N) and links (L) in a topology, and the number of LSP that need to be set up. In the common case where L is proportional to N (relatively constant node degree with growth), the time complexity of one CSPF computation is order(N log N). If each node must perform order(N) computations when a fault occurs, then the computational load increases as order(N^2 log N) as the number of nodes increases (where "^" is the power of operator and "N^2" is read "N-squared"). In practice at the time of writing, this imposes a limit of a few hundred nodes in a full mesh of MPLS LSP before the computational load is sufficient to result in unacceptable convergence times.

Two solutions are applied to reduce the amount of RSVP-TE signaling. Both involve subdividing the MPLS domain into a core and a set of regions.

3.3.1.  Reducing Signaling Load using LDP

LDP can be used for edge-to-edge LSP, using RSVP-TE to carry the LDP intra-core traffic and optionally also using RSVP-TE to carry the LDP intra-region traffic within each region. LDP does not support traffic engineering, but does support multipoint-to-point (MPTP) LSP, which require less signaling than edge-to-edge RSVP-TE point-to-point (PTP) LSP.
A drawback of this approach is the inability to use RSVP-TE protection (FRR or GMPLS protection) against failure of the border LSR sitting at a core/region boundary.

3.3.2.  Reducing Signaling Load using Hierarchy

When the number of nodes grows too large, the amount of RSVP-TE signaling can be reduced using the MPLS PSC hierarchy [RFC4206]. A core within the hierarchy can divide the topology into M regions of on average N/M nodes. Within a region the computational load is reduced by a factor of more than M^2, since order((N/M)^2 log (N/M)) replaces order(N^2 log N). Within the core, the computational load generally becomes quite small since M is usually a fairly small number (a few tens of regions) and each region is typically attached to the core in only two or three places.

Using hierarchy improves scaling but has two consequences. First, hierarchy effectively forces the use of platform label space. When a containing LSP is rerouted, the labels assigned to the contained LSP cannot be changed but may arrive on a different interface. Second, hierarchy results in much larger LSP. These LSP today are larger than any single component link and therefore force the use of the all-ones component in link bundles.

3.3.3.  Using Both LDP and RSVP-TE Hierarchy

It is also possible to use both LDP and RSVP-TE hierarchy. MPLS networks with a very large number of nodes may benefit from the use of both LDP and RSVP-TE hierarchy. The two techniques are certainly not mutually exclusive.

3.4.  Reducing Forwarding State

Both LDP and MPLS hierarchy have the benefit of reducing the amount of forwarding state. Using the example from Section 3.3, and using MPLS hierarchy, the worst case generally occurs at borders with the core.

For example, consider a network with approximately 1,000 nodes divided into 10 regions. At the edges, each node requires 1,000 LSP to other edge nodes.
The edge nodes also require 100 intra-region LSP. Within the core, if the core has only 3 attachments to each region, the core LSR have fewer than 100 intra-core LSP. At the border cutset between the core and a given region, in this example there are 100 edge nodes with inter-region LSP crossing that cutset, destined to 900 other edge nodes. That yields forwarding state on the order of 90,000 LSP at the border cutset. These same routers need only reroute well under 200 LSP when a multiple fault occurs, as long as only links are affected and a border LSR does not go down.

Interior to the core, the forwarding state is greatly reduced. If inter-region LSP have different characteristics, it makes sense to make use of aggregates with different characteristics. Rather than exchange information about every inter-region LSP within the intra-core LSP, it makes more sense to use multiple intra-core LSP between pairs of core nodes, each aggregating sets of inter-region LSP with common characteristics or common requirements.

3.5.  Avoiding Route Oscillation

Networks can become unstable when a feedback loop exists such that moving traffic to a link causes a metric such as delay to increase, which then causes traffic to move elsewhere. For example, the original ARPANET routing used a delay based cost metric and proved prone to route oscillations [DBP].

Delay may be used as a constraint in routing for high priority traffic, when this high priority traffic makes a minor contribution to total load, such that the movement of the high priority traffic has a small impact on the delay experienced by other high priority traffic. The safest way to measure delay is to make measurements based on traffic which is prioritized such that it is queued ahead of the lower priority traffic which will be affected if high priority traffic is moved.
The amount of high priority traffic must be constrained to consume a fraction of link capacities with the remaining capacity available to lower priority traffic.

Any measurement of jitter (delay variation) that is used in route decision is likely to cause oscillation. Jitter is caused by queuing effects and cannot be measured using a very high priority measurement traffic flow.

It may be possible to find links with constrained queuing delay or jitter using a theoretical maximum or a probability based bound on queuing delay or jitter at a given priority based on the types and amounts of traffic accepted and combining that theoretical limit with a measured delay at very high priority. Using delay or jitter as path metrics without creating oscillations is challenging.

Instability can occur due to poor performance and interaction with protocol timers. In this way a computational scaling problem can become a stability problem when a network becomes sufficiently large.

4.  New Challenges

New technical challenges are posed by [I-D.ietf-rtgwg-cl-requirement] in both the control plane and data plane.

Among the more difficult challenges are the following.

1.  The requirements related to delay or jitter conflict with requirements for scalability and stability (see Section 4.1.1),

2.  The combination of ingress control over LSP placement and retaining an ability to move traffic as demands dictate can pose challenges and such requirements can even be conflicting (see Section 4.1.2),

3.  Path symmetry requires extensions and is particularly challenging for very large LSP (see Section 4.1.3),

4.  Accommodating a very wide range of requirements among contained LSP can lead to inefficiency if the most stringent requirements are reflected in aggregates, or reduce scalability if a large number of aggregates are used to provide too fine a reflection of the requirements in the contained LSP (see Section 4.1.4),

5.  Backwards compatibility is somewhat limited due to the need to accommodate legacy multipath interfaces which provide too little information regarding their configured default behavior, and legacy LSP which provide too little information regarding their LSP requirements (see Section 4.1.5),

6.  Data plane challenges include those of accommodating very large LSP, large microflows, traffic ordering constraints imposed by a subset of LSP, and accounting for IP and LDP traffic (see Section 4.2).

4.1.  Control Plane Challenges

Some of the control plane requirements are particularly challenging. Handling large flows which aggregate smaller flows must be accomplished with minimal impact on scalability. Potentially conflicting are requirements for jitter and requirements for stability. Potentially conflicting are the requirements for ingress control of a large number of parameters, and the requirements for local control needed to achieve traffic balance across a composite link. These challenges and potential solutions are discussed in the following sections.

4.1.1.  Delay and Jitter Sensitive Routing

Delay and jitter sensitive routing are called for in [I-D.ietf-rtgwg-cl-requirement] in requirements FR#2, FR#7, FR#8, FR#9, FR#15, FR#16, FR#17, FR#18. Requirement FR#17 is particularly problematic, calling for constraints on jitter.

A tradeoff exists between scaling benefits of aggregating information, and potential benefits of using a finer granularity in delay reporting.
To maintain the scaling benefit, measured link delay for any given composite link SHOULD be aggregated into a small number of delay ranges. IGP-TE extensions MUST be provided which advertise the available capacities for each of the selected ranges.

For path selection of delay sensitive LSP, the ingress SHOULD bias link metrics based on available capacity and select a low cost path which meets LSP total path delay criteria. To communicate the requirements of an LSP, the ERO MUST be extended to indicate the per link constraints. To communicate the type of resource used, the RRO SHOULD be extended to carry an identification of the group that is used to carry the LSP at each link bundle hop.

4.1.2.  Local Control of Traffic Distribution

Many requirements in [I-D.ietf-rtgwg-cl-requirement] suggest that a node immediately adjacent to a component link should have a high degree of control over how traffic is distributed, as long as network performance objectives are met. Particularly relevant are FR#18 and FR#19.

The requirements to allow local control are potentially in conflict with requirement FR#21, which gives full control of component link selection to the LSP ingress. While supporting this capability is mandatory, use of this feature is optional per LSP.

A given network deployment will have to consider this set of conflicting requirements and make appropriate use of local control of traffic placement and ingress control of traffic placement to best meet network requirements.

4.1.3.  Path Symmetry Requirements

Requirement FR#21 in [I-D.ietf-rtgwg-cl-requirement] includes a provision to bind both directions of a bidirectional LSP to the same component. This is easily achieved if the LSP is directly signaled across a composite link.
This is not as easily achieved if a set of LSP with this requirement are signaled over a large hierarchical LSP which is in turn carried over a composite link. The basis for load distribution in such a case is the label stack. The labels in either direction are completely independent.

This could be accommodated if the ingress, egress, and all midpoints of the hierarchical LSP make use of an entropy label in the distribution, and the ingress uses a fixed value per contained LSP in the entropy label. A solution for this problem may add complexity with very little benefit. There is little or no true benefit of using symmetrical paths rather than component links of identical characteristics.

Traffic symmetry and large LSP capacity are a second pair of conflicting requirements. Any given LSP can meet one of these two requirements but not both. A given network deployment will have to make appropriate use of each of these features to best meet network requirements.

4.1.4.  Requirements for Contained LSP

[I-D.ietf-rtgwg-cl-requirement] calls for new LSP constraints. These constraints include frequency of load balancing rearrangement, delay and jitter, packet ordering constraints, and path symmetry.

When LSP are contained within hierarchical LSP, there is no signaling available at midpoint LSR which identifies the contained LSP, let alone provides the set of requirements unique to each contained LSP. Defining extensions to provide this information would severely impact scalability and defeat the purpose of aggregating control information and forwarding information into hierarchical LSP. For the same scalability reasons, not aggregating at all is not a viable option for large networks where scalability and stability problems may occur as a result.

As pointed out in Section 4.1.3, the benefits of supporting symmetric paths among LSP contained within hierarchical LSP may not be sufficient to justify the complexity of supporting this capability.

A scalable solution which accommodates multiple sets of LSP between given pairs of LSR is to provide multiple hierarchical LSP for each given pair of LSR, each hierarchical LSP aggregating LSP with common requirements and a common pair of endpoints. This is a network design technique available to the network operator rather than a protocol extension. This technique can accommodate multiple sets of delay and jitter parameters, multiple sets of frequency of load balancing parameters, multiple sets of packet ordering constraints, etc.

4.1.5.  Retaining Backwards Compatibility

Backwards compatibility and support for incremental deployment require considering the impact of legacy LSR in the role of LSP ingress, and considering the impact of legacy LSR advertising ordinary links, advertising Ethernet LAG as ordinary links, and advertising link bundles.

Legacy LSR in the role of LSP ingress cannot signal requirements which are not supported by their control plane software. The additional capabilities supported by other LSR have no impact on these LSR. These LSR, however, being unaware of extensions, may try to make use of scarce resources which support specific requirements such as low delay. To a limited extent it may be possible for a network operator to avoid this issue using existing mechanisms such as link administrative attributes and attribute affinities [RFC3209].

Legacy LSR advertising ordinary links will not advertise attributes needed by some LSP. For example, there is no way to determine the delay or jitter characteristics of such a link. Legacy LSR advertising Ethernet LAG pose additional problems.
There is no way to determine that packet ordering constraints would be violated for LSP with strict packet ordering constraints, or that frequency of load balancing rearrangement constraints might be violated.

Legacy LSR advertising link bundles have no way to advertise the configured default behavior of the link bundle. Some link bundles may be configured to place each LSP on a single component link and therefore may not be able to accommodate an LSP which requires bandwidth in excess of the size of a component link. Some link bundles may be configured to spread all LSP over the all-ones component. For LSR using the all-ones component link, there is no documented procedure for correctly setting the "Maximum LSP Bandwidth". There is currently no way to indicate the largest microflow that could be supported by a link bundle using the all-ones component link.

Having received the RRO, it is possible for an ingress to look for the all-ones component to identify such link bundles after having signaled at least one LSP. Whether any LSR collects this information on legacy LSR and makes use of it to set defaults is an implementation choice.

4.2.  Data Plane Challenges

Flow identification is briefly discussed in Section 2.1. Traffic distribution is briefly discussed in Section 2.3. This section discusses issues specific to particular requirements specified in [I-D.ietf-rtgwg-cl-requirement].

4.2.1.  Very Large LSP

Very large LSP may exceed the capacity of any single component of a composite link. In some cases contained LSP may exceed the capacity of any single component. These LSP may make use of the equivalent of the all-ones component of a link bundle, or may use a subset of components which meet the LSP requirements.

Very large LSP can be accommodated as long as they can be subdivided (see Section 4.2.2).
A very large LSP cannot have a requirement for symmetric paths unless complex protocol extensions are proposed (see Section 2.2 and Section 4.1.3).

4.2.2.  Very Large Microflows

Within a very large LSP there may be very large microflows. A very large microflow is one which cannot be further subdivided and contributes a very large amount of capacity. Flows which cannot be subdivided must be no larger than the capacity of any single component link.

Current signaling provides no way to specify, in routing advertisements, the largest microflow that can be supported on a given link bundle. Extensions which address this are discussed in Section 6.4. Absent extensions of this type, traffic containing microflows that are too large for a given composite link may be present. There is no data plane solution for this problem that would not require reordering traffic at the composite link egress.

Some techniques are susceptible to statistical collisions where an algorithm to distribute traffic is unable to disambiguate traffic among two or more very large microflows whose sum is in excess of the capacity of any single component. Hash based algorithms which use too small a hash space are particularly susceptible and require a change in hash seed in the event that this occurs. A change in hash seed is highly disruptive, causing traffic reordering among all traffic flows over which the hash function is applied.

4.2.3.  Traffic Ordering Constraints

Some LSP have strict traffic ordering constraints. Most notable among these are MPLS-TP LSP. In the absence of aggregation into hierarchical LSP, those LSP with strict traffic ordering constraints can be placed on individual component links if there is a means of identifying which LSP have such a constraint.
If LSP with strict traffic ordering constraints are aggregated in hierarchical LSP, the hierarchical LSP capacity may exceed the capacity of any single component link. In such a case the load balancing may be constrained through the use of an entropy label [RFC6790]. This and related issues are discussed further in Section 6.4.

4.2.4.  Accounting for IP and LDP Traffic

Networks which carry RSVP-TE signaled MPLS traffic generally carry low volumes of native IP traffic, often only carrying control traffic as native IP. There is no architectural guarantee of this; it is simply how network operators have made use of the protocols.

[I-D.ietf-rtgwg-cl-requirement] requires that native IP and native LDP be accommodated (DR#2 and DR#3). In some networks, a subset of services may be carried as native IP or carried as native LDP. Today this may be accommodated by the network operator estimating the contribution of IP and LDP and configuring a lower set of available bandwidth figures on the RSVP-TE advertisements.

The only improvement that Composite Link can offer is that of measuring the IP and LDP traffic levels and automatically reducing the available bandwidth figures on the RSVP-TE advertisements. The measurements would have to be filtered. This is similar to a feature in existing LSR, commonly known as "autobandwidth", with a key difference. In the "autobandwidth" feature, the bandwidth request of an RSVP-TE signaled LSP is adjusted in response to traffic measurements. In this case the IP or LDP traffic measurements are used to reduce the link bandwidth directly, without first encapsulating in an RSVP-TE LSP.

This may be a subtle and perhaps even a meaningless distinction if Composite Link is used to form a Sub-Path Maintenance Element (SPME).
An SPME is in practice essentially an unsignaled single hop LSP with PHP enabled [RFC5921]. A Composite Link SPME looks very much like classic multipath, where there is no signaling, only management plane configuration creating the multipath entity (of which Ethernet Link Aggregation is a subset).

4.2.5.  IP and LDP Limitations

IP does not offer traffic engineering. LDP cannot be extended to offer traffic engineering [RFC3468]. Therefore there is no traffic engineered fallback to an alternate path for IP and LDP traffic if resources are not adequate for the IP and/or LDP traffic alone on a given link in the primary path. The only option for IP and LDP would be to declare the link down. Declaring a link down due to resource exhaustion would reduce traffic to zero and eliminate the resource exhaustion. This would cause oscillations and is therefore not a viable solution.

Congestion caused by IP or LDP traffic loads is a pathological case that can occur if IP and/or LDP are carried natively and there is a high volume of IP or LDP traffic. This situation can be avoided by carrying IP and LDP within RSVP-TE LSP.

It is also not possible to route LDP traffic differently for different FEC. LDP traffic engineering is specifically disallowed by [RFC3468]. It may be possible to support multi-topology IGP extensions to accommodate more than one set of criteria. If so, the additional IGP could be bound to the forwarding criteria, and the LDP FEC bound to a specific IGP instance, inheriting the forwarding criteria. Alternatively, one IGP instance can be used and the LDP SPF can make use of the constraints, such as delay and jitter, for a given LDP FEC.

5.  Existing Mechanisms

In MPLS the one mechanism which supports explicit signaling of multiple parallel links is Link Bundling [RFC4201].
The set of techniques known as "classic multipath" support no explicit signaling, except in two cases. In Ethernet Link Aggregation the Link Aggregation Control Protocol (LACP) coordinates the addition or removal of members from an Ethernet Link Aggregation Group (LAG). The use of the "all-ones" component of a link bundle indicates use of classic multipath; however, the ability to determine if a link bundle makes use of classic multipath is not yet supported.

5.1.  Link Bundling

Link bundling supports advertisement of a set of homogenous links as a single route advertisement. Link bundling supports placement of an LSP on any single component link, or supports placement of an LSP on the all-ones component link. Not all link bundling implementations support the all-ones component link. There is no way for an ingress LSR to tell which potential midpoint LSR support this feature and use it by default and which do not. Based on [RFC4201] it is unclear how to advertise a link bundle for which the all-ones component link is available and used by default. Common practice is to violate the specification and set the Maximum LSP Bandwidth to the Available Bandwidth. There is no means to determine the largest microflow that could be supported by a link bundle that is using the all-ones component link.

[RFC6107] extends the procedures for hierarchical LSP but also extends link bundles. An LSP can be explicitly signaled to indicate that it is an LSP to be used as a component of a link bundle. Prior to that the common practice was to simply not advertise the component link LSP into the IGP, since only the ingress and egress of the link bundle needed to be aware of their existence, which they would be aware of due to the RSVP-TE signaling used in setting up the component LSP.

While link bundling can be the basis for composite links, a significant number of small extensions need to be added.

1.  To support link bundles of heterogeneous links, a means of advertising the capacity available within a group of homogeneous links needs to be provided.

2.  Attributes need to be defined to support the following parameters for the link bundle or for a group of homogeneous links.

    A.  delay range

    B.  jitter (delay variation) range

    C.  group metric

    D.  all-ones component capable

    E.  capable of dynamically balancing load

    F.  largest supportable microflow

    G.  support for entropy label

3.  For each of the prior extended attributes, the constraint based routing path selection needs to be extended to reflect new constraints based on the extended attributes.

4.  For each of the prior extended attributes, LSP admission control needs to be extended to reflect new constraints based on the extended attributes.

5.  Dynamic load balance must be provided for flows within a given set of links with common attributes such that NPO are not violated, including frequency of load balance adjustment for any given flow.

5.2.  Classic Multipath

Classic multipath is described in [I-D.ietf-rtgwg-cl-use-cases].

Classic multipath refers to the most common current practice in implementation and deployment of multipath. The most common current practice makes use of a hash on the MPLS label stack and, if IPv4 or IPv6 are indicated under the label stack, makes use of the IP source and destination addresses [RFC4385] [RFC4928].

Classic multipath provides a highly scalable means of load balancing. Dynamic multipath has proven value in assuring an even loading on component links and an ability to adapt to change in offered load that occurs over periods of hundreds of milliseconds or more.
Classic multipath scalability is due to the ability to effectively
work with an extremely large number of flows (IP host pairs) using
relatively few resources (a data structure accessed using a hash
result as a key or using ranges of hash results).

Classic multipath meets a small subset of Composite Link
requirements. Due to scalability of the approach, classic multipath
seems to be an excellent candidate for extension to meet the full set
of Composite Link forwarding requirements.

Additional detail can be found in [I-D.ietf-rtgwg-cl-use-cases].

6. Mechanisms Proposed in Other Documents

A number of documents which at the time of writing are works in
progress address parts of the requirements of Composite Link, or
assist in making some of the goals achievable.

6.1. Loss and Delay Measurement

Procedures for measuring loss and delay are provided in [RFC6374].
These are OAM based measurements. This work could be the basis of
the delay and delay variation measurements used for the metrics
called for in [I-D.ietf-rtgwg-cl-requirement].

Currently there are three documents that address delay and delay
variation metrics.

draft-ietf-ospf-te-metric-extensions
   [I-D.ietf-ospf-te-metric-extensions] provides a set of OSPF-TE
   extensions to support delay, jitter, and loss. Stability is not
   adequately addressed and some minor issues remain.

I-D.previdi-isis-te-metric-extensions
   [I-D.previdi-isis-te-metric-extensions] provides the set of
   extensions for ISIS that [I-D.ietf-ospf-te-metric-extensions]
   provides for OSPF. This draft mirrors
   [I-D.ietf-ospf-te-metric-extensions], sometimes lagging for a
   brief period when the OSPF version is updated.

I-D.atlas-mpls-te-express-path
   [I-D.atlas-mpls-te-express-path] provides information on the use
   of OSPF and ISIS extensions defined in
   [I-D.ietf-ospf-te-metric-extensions] and
   [I-D.previdi-isis-te-metric-extensions] and a modified CSPF path
   selection to meet LSP performance criteria such as minimal delay
   paths or bounded delay paths.

Delay variance, loss, residual bandwidth, and available bandwidth
extensions are particularly prone to network instability. The question
as to whether queuing delay and delay variation should be considered,
and if so for which diffserv PHB Scheduling Class (PSC), is not
adequately addressed in the current versions of these drafts. These
drafts are actively being discussed and updated, and remaining issues
are expected to be resolved.

6.2. Link Bundle Extensions

A set of extensions is needed to indicate a group of component links
in the ERO or RRO, where the group is given an interface
identification like the bundle itself. The extensions could also be
further extended to support specification of the all-ones component
link in the ERO or RRO.

[I-D.ospf-cc-stlv] provides a baseline draft for extending link
bundling to advertise components. A new component TLV (C-TLV) is
proposed, which must reference a Composite Link Link TLV.
[I-D.ospf-cc-stlv] is intended for the OSPF WG and submitted for the
"Experimental" track. The 00 version expired in February 2012. A
replacement is expected that will be submitted for consideration on
the standards track.

6.3. Pseudowire Flow and MPLS Entropy Labels

Two documents provide a means to add entropy for the purpose of
improving load balance. MPLS encapsulation can bury information that
is needed to identify microflows.
These two documents allow a pseudowire ingress and LSP ingress,
respectively, to add a label solely for the purpose of providing a
finer granularity of microflow groups.

[RFC6391] allows pseudowires which carry a large volume of traffic,
where microflows can be identified, to be load balanced across
multiple members of an Ethernet LAG or an MPLS link bundle. This is
accomplished by adding a flow label below the pseudowire label in the
MPLS label stack. For this to be effective, the link bundle load
balance must make use of the label stack up to and including this
flow label.

[RFC6790] provides a means for a LER to put an additional label known
as an entropy label on the MPLS label stack. Only the LER can add
the entropy label. The LER of a PSC LSP would have to add an entropy
label for contained LSPs for which it is a midpoint LSR.

Core LSR acting as LER for aggregated LSP can add entropy labels
based on deep packet inspection and place an entropy label indicator
(ELI) and entropy label (EL) just below the label being acted on.

This would be helpful in situations where the label stack depth to
which load distribution can operate is limited by implementation or
is limited for other reasons such as carrying both MPLS-TP and MPLS
with entropy labels within the same hierarchical LSP.

6.4. Multipath Extensions

The multipath extensions drafts address the issue of accommodating
LSP which have strict packet ordering constraints in a network
containing multipath. MPLS-TP has become an important instance
of LSP with strict packet ordering constraints and has driven this
work.

[I-D.ietf-mpls-multipath-use] proposes the use of the MPLS Entropy
Label [RFC6790] to allow MPLS-TP to be carried within MPLS LSP that
make use of multipath. Limitations of this approach in the absence of
protocol extensions are discussed.
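
The way an entropy label can keep a strictly ordered LSP on a single
component can be sketched as follows. This is a minimal illustration
under stated assumptions, not the procedure of
[I-D.ietf-mpls-multipath-use]. It assumes that the midpoint hashes
only on the label that follows the ELI (reserved label value 7 in
[RFC6790]) and that the ingress derives one entropy label per MPLS-TP
LSP. The function names, the CRC-32 hash, and the modulo selection
are hypothetical choices made for brevity.

```python
import zlib

ELI = 7  # Entropy Label Indicator, reserved label value in RFC 6790


def entropy_label(flow_key: bytes) -> int:
    """Map a flow key to a 20-bit label outside the reserved range 0-15."""
    return 16 + zlib.crc32(flow_key) % ((1 << 20) - 16)


def push_entropy(transport_label: int, flow_key: bytes) -> list:
    """Ingress side: label stack, top first, with ELI and EL added."""
    return [transport_label, ELI, entropy_label(flow_key)]


def select_component(label_stack: list, n_components: int) -> int:
    """Midpoint side: choose a component using only the entropy label."""
    el = label_stack[label_stack.index(ELI) + 1]
    # Modulo is used only to keep the sketch short.
    return zlib.crc32(el.to_bytes(3, "big")) % n_components


# Every packet of the MPLS-TP LSP carries the same entropy label, so
# the midpoint always selects the same component for that LSP.
tp_stack = push_entropy(1000, b"mpls-tp-lsp-1")
tp_component = select_component(tp_stack, 4)
```

An ordinary LSP would instead pass a per-microflow key (for example,
the IP source and destination addresses) to push_entropy, so its
microflows spread across components while each microflow stays in
order.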

[I-D.villamizar-mpls-multipath-extn] provides the protocol extensions
needed to overcome the limitations discussed in
[I-D.ietf-mpls-multipath-use].

7. Required Protocol Extensions and Mechanisms

Prior sections have reviewed key characteristics, architecture
tradeoffs, new challenges, existing mechanisms, and relevant
mechanisms proposed in other documents.

This section first summarizes and groups requirements specified in
[I-D.ietf-rtgwg-cl-requirement] (see Section 7.1). A set of
document coverage groupings is proposed, with existing works in
progress noted where applicable (see Section 7.2). The set of
extensions is then grouped by protocol affected as a convenience to
implementors (see Section 7.3).

7.1. Brief Review of Requirements

The following list provides a categorization of requirements
specified in [I-D.ietf-rtgwg-cl-requirement] along with a short
phrase indicating what topic the requirement covers.

routing information aggregation
   FR#1 (routing summarization), FR#20 (composite link may be a
   component of another composite link)

restoration speed
   FR#2 (restoration speed meeting NPO), FR#12 (minimally disruptive
   load rebalance), DR#6 (fast convergence), DR#7 (fast worst case
   failure convergence)

load distribution, stability, minimal disruption
   FR#3 (automatic load distribution), FR#5 (must not oscillate),
   FR#11 (dynamic placement of flows), FR#12 (minimally disruptive
   load rebalance), FR#13 (bounded rearrangement frequency), FR#18
   (flow placement must satisfy NPO), FR#19 (flow identification
   finer than per top level LSP), MR#6 (operator initiated flow
   rebalance)

backward compatibility and migration
   FR#4 (smooth incremental deployment), FR#6 (management and
   diagnostics must continue to function), DR#1 (extend existing
   protocols), DR#2 (extend LDP, no LDP TE)

delay and delay variation
   FR#7 (expose lower layer measured delay), FR#8 (precision of
   latency reporting), FR#9 (limit latency on per LSP basis), FR#15
   (minimum delay path), FR#16 (bounded delay path), FR#17 (bounded
   jitter path)

admission control, preemption, traffic engineering
   FR#10 (admission control, preemption), FR#14 (packet ordering),
   FR#21 (ingress specification of path), FR#22 (path symmetry),
   DR#3 (IP and LDP traffic), MR#3 (management specification of
   path)

single vs multiple domain
   DR#4 (IGP extensions allowed within single domain), DR#5 (IGP
   extensions disallowed in multiple domain case)

general network management
   MR#1 (polling, configuration, and notification), MR#2 (activation
   and de-activation)

path determination, connectivity verification
   MR#4 (path trace), MR#5 (connectivity verification)

The above list is not intended as a substitute for
[I-D.ietf-rtgwg-cl-requirement], but rather as a concise grouping
and reminder of requirements to serve as a means of more easily
determining requirements coverage of a set of protocol documents.

7.2. Proposed Document Coverage

The primary areas where additional protocol extensions and mechanisms
are required include the topics described in the following
subsections.

There are candidate documents for a subset of the topics below. This
grouping of topics does not require that each topic be addressed by a
separate document. In some cases, a document may cover multiple
topics, or a specific topic may be addressed as applicable in
multiple documents.

7.2.1. Component Link Grouping

An extension to link bundling is needed to specify a group of
components with common attributes. This can be a TLV defined within
the link bundle that carries the same encapsulations as the link
bundle. Two interface indices would be needed for each group.

a. An index is needed that, if included in an ERO, would indicate the
   need to place the LSP on any one component within the group.

b. A second index is needed that, if included in an ERO, would
   indicate the need to balance flows within the LSP across all
   components of the group. This is equivalent to the "all-ones"
   component for the entire bundle.

[I-D.ospf-cc-stlv] can be extended to include multipath treatment
capabilities. An ISIS solution is also needed. An extension of
RSVP-TE signaling is needed to indicate multipath treatment
preferences.

If a component group is allowed to support all of the parameters of a
link bundle, then a group TE metric would be accommodated. This can
be supported with the component TLV (C-TLV) defined in
[I-D.ospf-cc-stlv].

The primary focus of this document, among the sets of requirements
listed in Section 7.1, is the "routing information aggregation" set of
requirements.
The "restoration speed", "backward compatibility and
migration", and "general network management" requirements must also
be considered.

7.2.2. Delay and Jitter Extensions

An extension is needed in the IGP-TE advertisement to support delay
and delay variation for links, link bundles, and forwarding
adjacencies. Whatever mechanism is described must take precautions
that ensure that route oscillations cannot occur. The following set
of drafts addresses this.

1. [I-D.ietf-ospf-te-metric-extensions]

2. [I-D.previdi-isis-te-metric-extensions]

3. [I-D.atlas-mpls-te-express-path]

The primary focus of this document, among the sets of requirements
listed in Section 7.1, is the "delay and delay variation" set of
requirements. The "restoration speed", "backward compatibility and
migration", and "general network management" requirements must also
be considered.

7.2.3. Path Selection and Admission Control

Path selection and admission control changes must be documented in
each document that proposes a protocol extension that advertises a
new capability or parameter that must be supported by changes in path
selection and admission control.

It would also be helpful to have an informational document which
covers path selection and admission control issues in detail and
briefly summarizes and references the set of documents which propose
extensions. This document could be advanced in parallel with the
protocol extensions.

The primary focus of this document, among the sets of requirements
listed in Section 7.1, are the "load distribution, stability, minimal
disruption" and "admission control, preemption, traffic engineering"
sets of requirements. The "restoration speed" and "path
determination, connectivity verification" requirements must also be
considered.
The "backward compatibility and migration" and "general
network management" requirements must also be considered.

7.2.4. Dynamic Multipath Balance

FR#11 explicitly calls for dynamic placement of flows. Load
balancing similar to existing dynamic multipath would satisfy this
requirement. In implementations where flow identification uses a
coarse granularity, the adjustments would have to be equally coarse,
in the worst case moving an entire LSP. The impact of flow
identification granularity and potential dynamic multipath approaches
may need to be documented in greater detail than provided here.

The primary focus of this document, among the sets of requirements
listed in Section 7.1, are the "restoration speed" and the "load
distribution, stability, minimal disruption" sets of requirements.
The "path determination, connectivity verification" requirements must
also be considered. The "backward compatibility and migration" and
"general network management" requirements must also be considered.

7.2.5. Frequency of Load Balance

IGP-TE and RSVP-TE extensions are needed to support the frequency of
load balancing rearrangement called for in FR#13 and FR#15-FR#17.
Constraints are not defined in RSVP-TE, but could be modeled after
administrative attribute affinities in [RFC3209] and elsewhere.

The primary focus of this document, among the sets of requirements
listed in Section 7.1, is the "load distribution, stability, minimal
disruption" set of requirements. The "path determination,
connectivity verification" requirements must also be considered. The
"backward compatibility and migration" and "general network
management" requirements must also be considered.

7.2.6. Inter-Layer Communication

Lower layer to upper layer communication is called for in FR#7 and
FR#20. Specific parameters, particularly delay and delay variation,
need to be addressed.
Passing information from a lower non-MPLS
layer to an MPLS layer needs to be addressed, though this may largely
be generic advice encouraging a coupling of MPLS to lower layer
management plane or control plane interfaces. This topic can be
addressed in each document proposing a protocol extension, where
applicable.

The primary focus of this document, among the sets of requirements
listed in Section 7.1, is the "restoration speed" set of requirements.
The "backward compatibility and migration" and "general network
management" requirements must also be considered.

7.2.7. Packet Ordering Requirements

A document is needed to define extensions supporting various packet
ordering requirements, ranging from requirements to preserve
microflow ordering only, to requirements to preserve full LSP
ordering (as in MPLS-TP). This is covered by
[I-D.ietf-mpls-multipath-use] and
[I-D.villamizar-mpls-multipath-extn].

The primary focus of this document, among the sets of requirements
listed in Section 7.1, are the "admission control, preemption, traffic
engineering" and the "path determination, connectivity verification"
sets of requirements. The "backward compatibility and migration" and
"general network management" requirements must also be considered.

7.2.8. Minimally Disruptive Load Balance

The behavior of hash methods used in classic multipath needs to be
described in terms of FR#12, which calls for minimally disruptive load
adjustments. For example, reseeding the hash violates FR#12. Using
modulo operations is significantly disruptive if a link comes up or
goes down, as pointed out in [RFC2992]. In addition, backwards
compatibility with older hardware needs to be accommodated.
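
The disruption caused by modulo operations can be illustrated with a
small experiment. The sketch below compares modulo selection with the
hash-threshold method analyzed in [RFC2992] by counting how many hash
values map to a different component after one component is removed.
The key space size, component names, and function names are
hypothetical, and a real implementation hashes packet fields rather
than enumerating hash values.

```python
def modulo_n(key, links, keyspace):
    """Select a component as hash value modulo the number of links."""
    return links[key % len(links)]


def hash_threshold(key, links, keyspace):
    """Select a component from equal-sized contiguous hash regions."""
    return links[key * len(links) // keyspace]


def moved_keys(select, before, after, keyspace):
    """Count hash values whose component differs between two link sets."""
    return sum(
        select(k, before, keyspace) != select(k, after, keyspace)
        for k in range(keyspace)
    )


KEYSPACE = 1200  # small, hypothetical hash range divisible by 3 and 4
before = ["a", "b", "c", "d"]
after = ["a", "c", "d"]  # component "b" has gone down

moved_mod = moved_keys(modulo_n, before, after, KEYSPACE)
moved_thr = moved_keys(hash_threshold, before, after, KEYSPACE)
```

With these inputs, 900 of the 1200 hash values change component under
modulo selection, against 400 under hash-threshold. In both cases 300
of those values were on the failed component and had to move, so the
avoidable disruption is 600 values for modulo and 100 for
hash-threshold.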

The primary focus of this document, among the sets of requirements
listed in Section 7.1, is the "load distribution, stability, minimal
disruption" set of requirements.

7.2.9. Path Symmetry

Protocol extensions are needed to support dynamic load balance as
called for to meet FR#22 (path symmetry) and to meet FR#11 (dynamic
placement of flows).

Currently, path symmetry can only be supported in link bundling if the
path is pinned. When a flow is moved, both ingress and egress must
make the move as close to simultaneously as possible to satisfy FR#22
and FR#12 (minimally disruptive load rebalance). There is currently
no protocol to coordinate this move.

If a group of flows is identified using a hash, then the hash must
be identical on the pair of LSR at the endpoints, using the same hash
seed and with one side swapping source and destination. If the label
stack is used, then either the entire label stack must be a special
case flow identification, since the set of labels in either direction
are not correlated, or the two LSR must conspire to use the same flow
identifier. For example, using a common entropy label value, and
using only the entropy label in the flow identification, would satisfy
the forwarding requirement. There is no protocol to indicate special
treatment of a label stack within a hierarchical LSP. Adding such an
extension may add significant complexity and ultimately may prove
unscalable.

The primary focus of this document, among the sets of requirements
listed in Section 7.1, are the "load distribution, stability, minimal
disruption" and the "admission control, preemption, traffic
engineering" sets of requirements. The "backward compatibility and
migration" and "general network management" requirements must also be
considered.
Path symmetry simplifies support for the "path
determination, connectivity verification" set of requirements, but
with significant complexity added elsewhere.

7.2.10. Performance, Scalability, and Stability

A separate document providing analysis of performance, scalability,
and stability impacts of changes may be needed. The topic of traffic
adjustment oscillation must also be covered. If sufficient coverage
is provided in each document covering a protocol extension, a
separate document would not be needed.

The primary focus of this document, among the sets of requirements
listed in Section 7.1, is the "restoration speed" set of requirements.
This is not a simple topic and not a topic that is well served by
scattering it over multiple documents; therefore, it may be best to
put this in a separate document and put citations in documents called
for in Section 7.2.1, Section 7.2.2, Section 7.2.3, Section 7.2.9,
Section 7.2.11, Section 7.2.12, Section 7.2.13, and Section 7.2.14.
Citation may also be helpful in Section 7.2.4 and Section 7.2.5.

7.2.11. IP and LDP Traffic

A document is needed to define the use of measurements of native IP
and native LDP traffic levels which are then used to reduce link
advertised bandwidth amounts.

The primary focus of this document, among the sets of requirements
listed in Section 7.1, are the "load distribution, stability, minimal
disruption" and the "admission control, preemption, traffic
engineering" sets of requirements. The "path determination,
connectivity verification" requirements must also be considered. The
"backward compatibility and migration" and "general network
management" requirements must also be considered.

7.2.12. LDP Extensions

Extending LDP is called for in DR#2.
LDP can be extended to couple
FEC admission control to local resource availability without
providing LDP traffic engineering capability. Other LDP extensions
such as signaling a bound on microflow size and LDP LSP requirements
would provide useful information without providing LDP traffic
engineering capability.

The primary focus of this document, among the sets of requirements
listed in Section 7.1, is the "admission control, preemption, traffic
engineering" set of requirements. The "backward compatibility and
migration" and "general network management" requirements must also be
considered.

7.2.13. Pseudowire Extensions

Pseudowire (PW) extensions such as signaling a bound on microflow
size and signaling requirements specific to PW would provide useful
information. This information can be carried in the PW LDP signaling
[RFC3985] and the PW requirements could then be used in a
containing LSP.

The primary focus of this document, among the sets of requirements
listed in Section 7.1, is the "admission control, preemption, traffic
engineering" set of requirements. The "backward compatibility and
migration" and "general network management" requirements must also be
considered.

7.2.14. Multi-Domain Composite Link

DR#5 calls for Composite Link to span multiple network topologies.
Component LSP may already span multiple network topologies, though
most often in practice these are LDP signaled. Component LSP which
are RSVP-TE signaled may also span multiple network topologies using
at least three existing methods (per domain [RFC5152], BRPC
[RFC5441], PCE [RFC4655]). When such component links are combined in
a Composite Link, the Composite Link spans multiple network
topologies. It is not clear in which document this needs to be
described or whether this description in the framework is sufficient.
The authors and/or the WG may need to discuss this. DR#5 mandates
that IGP-TE extensions cannot be used. This would disallow the use of
[RFC5316] or [RFC5392] in conjunction with [RFC5151].

The primary focus of this document, among the sets of requirements
listed in Section 7.1, are "single vs multiple domain" and "admission
control, preemption, traffic engineering". The "routing information
aggregation" and "load distribution, stability, minimal disruption"
requirements need attention due to their use of the IGP in single
domain Composite Link. Other requirements, such as "delay and delay
variation", can more easily be accommodated by carrying metrics
within BGP. The "path determination, connectivity verification"
requirements need attention due to requirements to restrict
disclosure of topology information across domains in multi-domain
deployments. The "backward compatibility and migration" and "general
network management" requirements must also be considered.

7.3. Framework Requirement Coverage by Protocol

As an aid to implementors, this section summarizes requirement
coverage listed in Section 7.2 by protocol or LSR functionality
affected.

Some documentation may be purely informational, proposing no changes
and proposing usage at most. This includes Section 7.2.3,
Section 7.2.8, Section 7.2.10, and Section 7.2.14.

Section 7.2.9 may require a new protocol.

7.3.1. OSPF-TE and ISIS-TE Protocol Extensions

Many of the changes listed in Section 7.2 require IGP-TE changes,
though most are small extensions to provide additional information.
This set includes Section 7.2.1, Section 7.2.2, Section 7.2.5,
Section 7.2.6, and Section 7.2.7. An adjustment to existing
advertised parameters is suggested in Section 7.2.11.

7.3.2. PW Protocol Extensions

The only suggestion of pseudowire (PW) extensions is in
Section 7.2.13.

7.3.3. LDP Protocol Extensions

Potential LDP extensions are described in Section 7.2.12.

7.3.4. RSVP-TE Protocol Extensions

RSVP-TE protocol extensions are called for in Section 7.2.1,
Section 7.2.5, Section 7.2.7, and Section 7.2.9.

7.3.5. RSVP-TE Path Selection Changes

Section 7.2.3 calls for path selection to be addressed in individual
documents that require change. These changes would include those
proposed in Section 7.2.1, Section 7.2.2, Section 7.2.5, and
Section 7.2.7.

7.3.6. RSVP-TE Admission Control and Preemption

When a change is needed to path selection, a corresponding change is
needed in admission control. The same set of sections applies:
Section 7.2.1, Section 7.2.2, Section 7.2.5, and Section 7.2.7. Some
resource changes such as a link delay change might trigger
preemption. The rules of preemption remain unchanged, still based on
holding priority.

7.3.7. Flow Identification and Traffic Balance

The following sections either describe the state of the art in flow
identification and traffic balance or propose changes: Section 7.2.4,
Section 7.2.5, Section 7.2.7, and Section 7.2.8.

8. IANA Considerations

This is a framework document and therefore does not specify protocol
extensions. This memo includes no request to IANA.

9. Security Considerations

The security considerations for MPLS/GMPLS and for MPLS-TP are
documented in [RFC5920] and [RFC6941].

The types of protocol extensions proposed in this framework document
provide additional information about links, forwarding adjacencies,
and LSP requirements. The protocol semantics changes described in
this framework document propose additional LSP constraints applied at
path computation time and at LSP admission at midpoint LSRs.
The additional information and constraints introduce no additional
security considerations beyond the security considerations already
documented in [RFC5920] and [RFC6941].

10. Acknowledgments

Authors would like to thank Adrian Farrel, Fred Jounay, and Yuji Kamite
for their extensive comments and suggestions regarding early versions
of this document, and Ron Bonica, Nabil Bitar, Eric Gray, Lou Berger,
and Kireeti Kompella for their reviews of early versions and great
suggestions.

Authors would like to thank Iftekhar Hussain for review and
suggestions regarding recent versions of this document.

In the interest of full disclosure of affiliation and in the interest
of acknowledging sponsorship, past affiliations of authors are noted.
Much of the work done by Ning So occurred while Ning was at Verizon.
Much of the work done by Curtis Villamizar occurred while at
Infinera. Infinera continues to sponsor this work on a consulting
basis.

11. References

11.1. Normative References

[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
          Requirement Levels", BCP 14, RFC 2119, March 1997.

[RFC3209] Awduche, D., Berger, L., Gan, D., Li, T., Srinivasan, V.,
          and G. Swallow, "RSVP-TE: Extensions to RSVP for LSP
          Tunnels", RFC 3209, December 2001.

[RFC3630] Katz, D., Kompella, K., and D. Yeung, "Traffic Engineering
          (TE) Extensions to OSPF Version 2", RFC 3630,
          September 2003.

[RFC4201] Kompella, K., Rekhter, Y., and L. Berger, "Link Bundling
          in MPLS Traffic Engineering (TE)", RFC 4201, October 2005.

[RFC4206] Kompella, K. and Y. Rekhter, "Label Switched Paths (LSP)
          Hierarchy with Generalized Multi-Protocol Label Switching
          (GMPLS) Traffic Engineering (TE)", RFC 4206, October 2005.

[RFC5036] Andersson, L., Minei, I., and B. Thomas, "LDP
          Specification", RFC 5036, October 2007.

[RFC5305] Li, T. and H.
          Smit, "IS-IS Extensions for Traffic
          Engineering", RFC 5305, October 2008.

[RFC5712] Meyer, M. and JP. Vasseur, "MPLS Traffic Engineering Soft
          Preemption", RFC 5712, January 2010.

[RFC6107] Shiomoto, K. and A. Farrel, "Procedures for Dynamically
          Signaled Hierarchical Label Switched Paths", RFC 6107,
          February 2011.

[RFC6374] Frost, D. and S. Bryant, "Packet Loss and Delay
          Measurement for MPLS Networks", RFC 6374, September 2011.

[RFC6391] Bryant, S., Filsfils, C., Drafz, U., Kompella, V., Regan,
          J., and S. Amante, "Flow-Aware Transport of Pseudowires
          over an MPLS Packet Switched Network", RFC 6391,
          November 2011.

11.2. Informative References

[DBP]     Bertsekas, D., "Dynamic Behavior of Shortest Path Routing
          Algorithms for Communication Networks", IEEE Trans. Auto.
          Control 1982.

[I-D.atlas-mpls-te-express-path]
          Atlas, A., Drake, J., Giacalone, S., Ward, D., Previdi,
          S., and C. Filsfils, "Performance-based Path Selection for
          Explicitly Routed LSPs",
          draft-atlas-mpls-te-express-path-02 (work in progress),
          February 2013.

[I-D.ietf-mpls-multipath-use]
          Villamizar, C., "Use of Multipath with MPLS-TP and MPLS",
          draft-ietf-mpls-multipath-use-00 (work in progress),
          February 2013.

[I-D.ietf-ospf-te-metric-extensions]
          Giacalone, S., Ward, D., Drake, J., Atlas, A., and S.
          Previdi, "OSPF Traffic Engineering (TE) Metric
          Extensions", draft-ietf-ospf-te-metric-extensions-04 (work
          in progress), June 2013.

[I-D.ietf-rtgwg-cl-requirement]
          Villamizar, C., McDysan, D., Ning, S., Malis, A., and L.
          Yong, "Requirements for MPLS Over a Composite Link",
          draft-ietf-rtgwg-cl-requirement-08 (work in progress),
          August 2012.

[I-D.ietf-rtgwg-cl-use-cases]
          Ning, S., Malis, A., McDysan, D., Yong, L., and C.
          Villamizar, "Composite Link Use Cases and Design
          Considerations", draft-ietf-rtgwg-cl-use-cases-01 (work in
          progress), August 2012.

[I-D.kompella-mpls-rsvp-ecmp]
          Kompella, K., "Multi-path Label Switched Paths Signaled
          Using RSVP-TE", draft-kompella-mpls-rsvp-ecmp-03 (work in
          progress), May 2013.

[I-D.ospf-cc-stlv]
          Osborne, E., "Component and Composite Link Membership in
          OSPF", draft-ospf-cc-stlv-00 (work in progress),
          August 2011.

[I-D.previdi-isis-te-metric-extensions]
          Previdi, S., Giacalone, S., Ward, D., Drake, J., Atlas,
          A., and C. Filsfils, "IS-IS Traffic Engineering (TE)
          Metric Extensions",
          draft-previdi-isis-te-metric-extensions-03 (work in
          progress), February 2013.

[I-D.villamizar-mpls-multipath-extn]
          Villamizar, C., "Multipath Extensions for MPLS Traffic
          Engineering", draft-villamizar-mpls-multipath-extn-00
          (work in progress), November 2012.

[RFC2475] Blake, S., Black, D., Carlson, M., Davies, E., Wang, Z.,
          and W. Weiss, "An Architecture for Differentiated
          Services", RFC 2475, December 1998.

[RFC2991] Thaler, D. and C. Hopps, "Multipath Issues in Unicast and
          Multicast Next-Hop Selection", RFC 2991, November 2000.

[RFC2992] Hopps, C., "Analysis of an Equal-Cost Multi-Path
          Algorithm", RFC 2992, November 2000.

[RFC3260] Grossman, D., "New Terminology and Clarifications for
          Diffserv", RFC 3260, April 2002.

[RFC3468] Andersson, L. and G. Swallow, "The Multiprotocol Label
          Switching (MPLS) Working Group decision on MPLS signaling
          protocols", RFC 3468, February 2003.

[RFC3945] Mannie, E., "Generalized Multi-Protocol Label Switching
          (GMPLS) Architecture", RFC 3945, October 2004.

[RFC3985] Bryant, S. and P. Pate, "Pseudo Wire Emulation Edge-to-
          Edge (PWE3) Architecture", RFC 3985, March 2005.

[RFC4385] Bryant, S., Swallow, G., Martini, L., and D.
          McPherson,
          "Pseudowire Emulation Edge-to-Edge (PWE3) Control Word for
          Use over an MPLS PSN", RFC 4385, February 2006.

[RFC4448] Martini, L., Rosen, E., El-Aawar, N., and G. Heron,
          "Encapsulation Methods for Transport of Ethernet over MPLS
          Networks", RFC 4448, April 2006.

[RFC4655] Farrel, A., Vasseur, J., and J. Ash, "A Path Computation
          Element (PCE)-Based Architecture", RFC 4655, August 2006.

[RFC4928] Swallow, G., Bryant, S., and L. Andersson, "Avoiding Equal
          Cost Multipath Treatment in MPLS Networks", BCP 128,
          RFC 4928, June 2007.

[RFC5151] Farrel, A., Ayyangar, A., and JP. Vasseur, "Inter-Domain
          MPLS and GMPLS Traffic Engineering -- Resource Reservation
          Protocol-Traffic Engineering (RSVP-TE) Extensions",
          RFC 5151, February 2008.

[RFC5152] Vasseur, JP., Ayyangar, A., and R. Zhang, "A Per-Domain
          Path Computation Method for Establishing Inter-Domain
          Traffic Engineering (TE) Label Switched Paths (LSPs)",
          RFC 5152, February 2008.

[RFC5316] Chen, M., Zhang, R., and X. Duan, "ISIS Extensions in
          Support of Inter-Autonomous System (AS) MPLS and GMPLS
          Traffic Engineering", RFC 5316, December 2008.

[RFC5392] Chen, M., Zhang, R., and X. Duan, "OSPF Extensions in
          Support of Inter-Autonomous System (AS) MPLS and GMPLS
          Traffic Engineering", RFC 5392, January 2009.

[RFC5441] Vasseur, JP., Zhang, R., Bitar, N., and JL. Le Roux, "A
          Backward-Recursive PCE-Based Computation (BRPC) Procedure
          to Compute Shortest Constrained Inter-Domain Traffic
          Engineering Label Switched Paths", RFC 5441, April 2009.

[RFC5920] Fang, L., "Security Framework for MPLS and GMPLS
          Networks", RFC 5920, July 2010.

[RFC5921] Bocci, M., Bryant, S., Frost, D., Levrau, L., and L.
          Berger, "A Framework for MPLS in Transport Networks",
          RFC 5921, July 2010.

[RFC6790] Kompella, K., Drake, J., Amante, S., Henderickx, W., and
          L.
          Yong, "The Use of Entropy Labels in MPLS Forwarding",
          RFC 6790, November 2012.

[RFC6941] Fang, L., Niven-Jenkins, B., Mansfield, S., and R.
          Graveman, "MPLS Transport Profile (MPLS-TP) Security
          Framework", RFC 6941, April 2013.

Authors' Addresses

So Ning
Tata Communications

Email: ning.so@tatacommunications.com

Dave McDysan
Verizon
22001 Loudoun County PKWY
Ashburn, VA 20147

Email: dave.mcdysan@verizon.com

Eric Osborne
Cisco

Email: eosborne@cisco.com

Lucy Yong
Huawei USA
5340 Legacy Dr.
Plano, TX 75025

Phone: +1 469-277-5837
Email: lucy.yong@huawei.com

Curtis Villamizar
Outer Cape Cod Network Consulting

Email: curtis@occnc.com