Network Working Group                                            S. Kini
Internet-Draft
Intended status: Informational                               K. Kompella
Expires: October 26, 2017                                        Juniper
                                                            S. Sivabalan
                                                                   Cisco
                                                            S. Litkowski
                                                                  Orange
                                                               R. Shakir
                                                                  Google
                                                             J. Tantsura
                                                          April 24, 2017


                    Entropy label for SPRING tunnels
                draft-ietf-mpls-spring-entropy-label-05

Abstract

   Source-routed tunneling with label stacking is a technique that can
   be used to steer a packet through a controlled set of segments.  It
   can be applied to the Multiprotocol Label Switching (MPLS) data
   plane.  Entropy label (EL) is a technique used in MPLS to improve
   load balancing.  This document examines and describes how ELs are to
   be applied to source routed tunnels with label stacks.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on October 26, 2017.

Copyright Notice

   Copyright (c) 2017 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.
   Code Components extracted from this document must include Simplified
   BSD License text as described in Section 4.e of the Trust Legal
   Provisions and are provided without warranty as described in the
   Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  Requirements Language
   2.  Abbreviations and Terminology
   3.  Use-case requiring multipath load balancing
   4.  Entropy Readable Label Depth
   5.  Maximum SID Depth
   6.  LSP stitching using the binding SID
   7.  Insertion of entropy labels for SPRING path
     7.1.  Overview
       7.1.1.  Example 1
       7.1.2.  Example 2
     7.2.  Considerations for the placement of entropy labels
       7.2.1.  ERLD value
       7.2.2.  Segment type
         7.2.2.1.  Node-SID
         7.2.2.2.  Adjacency-SID representing an ECMP bundle
         7.2.2.3.  Adjacency-SID representing a single IP link
         7.2.2.4.  Adjacency-SID representing a single link within a
                   L2 bundle
         7.2.2.5.  Adjacency-SID representing a L2 bundle
       7.2.3.  Maximizing number of LSRs that will loadbalance
       7.2.4.  Preference for a part of the path
       7.2.5.  Combining criteria
   8.  A simple algorithm example
   9.  Deployment Considerations
   10. Options considered
     10.1.  Single EL at the bottom of the stack of tunnels
     10.2.  An EL per tunnel in the stack
     10.3.  A re-usable EL for a stack of tunnels
     10.4.  EL at top of stack
     10.5.  ELs at readable label stack depths
   11. Acknowledgements
   12. Contributors
   13. IANA Considerations
   14. Security Considerations
   15. References
     15.1.  Normative References
     15.2.  Informative References
   Authors' Addresses

1.  Introduction

   The source routed tunnels with label stacking paradigm is leveraged
   by techniques such as Segment Routing (SR)
   [I-D.ietf-spring-segment-routing] to steer a packet through a set of
   segments.  This can be directly applied to the MPLS data plane, but
   it has implications on the label stack depth.

   Clarifying statements on label stack depth have been provided in
   [RFC7325], but that RFC does not address the case of source routed
   stacked MPLS tunnels as described in
   [I-D.ietf-spring-segment-routing], where deeper label stacks are more
   prevalent.

   Entropy label (EL) [RFC6790] is a technique used in the MPLS data
   plane to provide entropy for load balancing.  When using LSP
   hierarchies there are implications on how [RFC6790] should be
   applied.  The current document addresses the case where the hierarchy
   is created at a single LSR as required by source routed tunnels with
   label stacks.

   A use-case requiring load balancing with source routed tunnels with
   label stacks is given in Section 3.
   A recommended solution is described in Section 7, keeping in
   consideration the limitations of implementations when applying
   [RFC6790] to deeper label stacks.  Options that were considered to
   arrive at the recommended solution are documented for historical
   purposes in Section 10.

1.1.  Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

   Although this document is not a protocol specification, the use of
   this language clarifies the instructions to protocol designers
   producing solutions that satisfy the requirements set out in this
   document.

2.  Abbreviations and Terminology

   EL - Entropy Label

   ELI - Entropy Label Identifier

   ELC - Entropy Label Capability

   ERLD - Entropy Readable Label Depth

   SR - Segment Routing

   ECMP - Equal-Cost Multipath

   LSR - Label Switch Router

   MPLS - Multiprotocol Label Switching

   MSD - Maximum SID Depth

   SID - Segment Identifier

   RLD - Readable Label Depth

   OAM - Operations, Administration, and Maintenance

3.  Use-case requiring multipath load balancing

                         +------+
                         |      |
                 +-------|  P3  |-----+
                 | +-----|      |---+ |
               L3| |L4   +------+ L1| |L2     +----+
                 | |                | |    +--| P4 |--+
   +-----+     +-----+            +-----+  |  +----+  |  +-----+
   |  S  |-----|  P1 |------------|  P2 |--+          +--|  D  |
   |     |     |     |            |     |--+          +--|     |
   +-----+     +-----+            +-----+  |  +----+  |  +-----+
                                           +--| P5 |--+
                                              +----+

   S=Source LSR, D=Destination LSR, P1,P2,P3,P4,P5=Transit LSRs,
   L1,L2,L3,L4=Links

                 Figure 1: Traffic engineering use-case

   Traffic-engineering (TE) is one of the applications of MPLS and is
   also a requirement for source routed tunnels with label stacks
   [RFC7855].  Consider the topology shown in Figure 1.
   The LSR S requires data to be sent to LSR D along a
   traffic-engineered path that goes over the link L1.  Good load
   balancing is also required across equal cost paths (including
   parallel links).  To engineer traffic along a path that takes link
   L1, the label stack that LSR S creates consists of a label to the
   node SID of LSR P3, stacked over the label for the adjacency SID of
   link L1, which in turn is stacked over the label to the node SID of
   LSR D.  For simplicity, let's assume that all LSRs use the same label
   space (SRGB) for source routed label stacks.  Let L_N-Px denote the
   label to be used to reach the node SID of LSR Px.  Let L_A-Ln denote
   the label used for the adjacency SID for link Ln.  The LSR S must use
   the label stack <L_N-P3, L_A-L1, L_N-D> for traffic-engineering.
   However, to achieve good load balancing over the equal cost paths
   P2-P4-D, P2-P5-D and the parallel links L3, L4, a mechanism such as
   Entropy labels [RFC6790] should be adapted for source routed label
   stacks.  Indeed, the SPRING architecture with the MPLS dataplane uses
   nested MPLS LSPs composing the source routed label stacks.  As each
   MPLS node may be limited in the number of labels it can push when it
   is the ingress, or inspect when it is load balancing, the entropy
   label insertion strategy becomes important for retaining the benefit
   of load balancing.  Multiple ways to apply entropy labels were
   considered and are documented in Section 10 along with their
   tradeoffs.  A recommended solution is described in Section 7.

4.  Entropy Readable Label Depth

   The Entropy Readable Label Depth (ERLD) is defined as the number of
   labels a router can:

   a.  read in an MPLS packet received on its incoming interface
       (starting from the top of the stack) and

   b.  use in its load balancing function.

   In other words, the router will perform load balancing using the EL
   if the EL is placed within the first ERLD labels of the stack.
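
   The definition above can be sketched in a few lines of Python.  This
   is only an illustration, not part of the draft: the function name
   el_usable and the EL value 500000 are assumptions made for the
   example, while the ELI value 7 is the reserved label from [RFC6790].
   Stacks are listed from the top (position 1) to the bottom, and the
   five sample stacks are those of Figure 2.

```python
ELI = 7          # Entropy Label Indicator, reserved label value (RFC 6790)
EL = 500000      # hypothetical entropy label value, chosen for illustration


def el_usable(stack, erld):
    """Return True if the first EL in `stack` lies within the top `erld` labels.

    `stack` lists label values from top (position 1) to bottom.
    """
    for position, label in enumerate(stack, start=1):
        if label == ELI and position < len(stack):
            # The EL is the label immediately below the ELI.
            return position + 1 <= erld
    return False  # no ELI/EL pair in the stack


# The five packets of Figure 2, top of stack first.
PACKETS = [
    [16, ELI, EL],
    [16, 20, ELI, EL],
    [16, 20, 30, ELI, EL],
    [16, 20, 30, 40, ELI, EL],
    [16, 20, 30, 40, 50, ELI, EL],
]

for erld in (3, 5, 10):
    usable = [n for n, p in enumerate(PACKETS, start=1) if el_usable(p, erld)]
    print(f"ERLD {erld}: EL usable for packets {usable}")
```

   Only the first ELI/EL pair is examined, since any deeper pair would
   sit even further from the top of the stack.
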

   A router capable of reading N labels but not using an EL located
   within those N labels MUST consider its ERLD to be 0.  In a
   distributed switching architecture, each linecard may have a
   different capability in terms of ERLD.  For simplicity, an
   implementation MAY use the minimum ERLD among its linecards as the
   ERLD value for the system.

   Examples:

                                                          | Payload  |
                                                          +----------+
                                             | Payload  | |    EL    | P7
                                             +----------+ +----------+
                                | Payload  | |    EL    | |   ELI    |
                                +----------+ +----------+ +----------+
                   | Payload  | |    EL    | |   ELI    | | Label 50 |
                   +----------+ +----------+ +----------+ +----------+
      | Payload  | |    EL    | |   ELI    | | Label 40 | | Label 40 |
      +----------+ +----------+ +----------+ +----------+ +----------+
      |    EL    | |   ELI    | | Label 30 | | Label 30 | | Label 30 |
      +----------+ +----------+ +----------+ +----------+ +----------+
      |   ELI    | | Label 20 | | Label 20 | | Label 20 | | Label 20 |
      +----------+ +----------+ +----------+ +----------+ +----------+
      | Label 16 | | Label 16 | | Label 16 | | Label 16 | | Label 16 | P1
      +----------+ +----------+ +----------+ +----------+ +----------+
        Packet 1     Packet 2     Packet 3     Packet 4     Packet 5

                   Figure 2: Label stacks with ELI/EL

   In Figure 2, we consider the displayed packets as received on a
   router interface.  We also consider a single ERLD value for the
   router.

   o  If the router has an ERLD of 3, it will be able to load-balance
      Packet 1 of Figure 2 using the EL as part of the load balancing
      keys.  The ERLD value of 3 means that the router can read the
      entropy label and take it into account for load balancing if it
      is placed between position 1 (top) and position 3.

   o  If the router has an ERLD of 5, it will be able to load-balance
      Packets 1 to 3 of Figure 2 using the EL as part of the load
      balancing keys.
      Packets 4 and 5 have the EL placed at a position greater than 5,
      so the router is not able to read it and take it into account
      during hashing.

   o  If the router has an ERLD of 10, it will be able to load-balance
      all the packets displayed in Figure 2 using the EL as part of the
      load balancing keys.

   To allow efficient load balancing based on entropy labels, a router
   running SPRING SHOULD advertise its ERLD (or ERLDs), so that all the
   other SPRING routers in the network are aware of its capability.  How
   this advertisement is done is out of scope of this document.

   To advertise an ERLD value, a SPRING router:

   o  MUST be entropy label capable and, as a consequence, MUST apply
      all the procedures defined in [RFC6790].

   o  MUST be able to read an ELI/EL that is located within its ERLD
      value.

   o  MUST take this EL into account in its load balancing function.

5.  Maximum SID Depth

   The Maximum SID Depth (MSD) defines the maximum number of labels that
   a particular node can impose on a packet.  This includes any kind of
   label (service, entropy, transport...).  In an MPLS network, the MSD
   is a limit of the Ingress LSR (I-LSR) or of any stitching node that
   imposes additional labels on an existing label stack.

   Depending on the number of MPLS operations (POP, SWAP...) to be
   performed before the PUSH, the MSD may vary due to hardware or
   software limitations.  As for the ERLD, there may also be different
   MSD limits based on the linecard type used in a distributed switching
   system.

   When an external controller is used to program a label stack on a
   particular node, this node MAY advertise its MSD value or a subset of
   its MSD value to the controller.  How this advertisement is done is
   out of scope of this document.
   As the controller does not have knowledge of the entire label stack
   to be pushed by the node, the node may advertise an MSD value that is
   lower than its real limit.  This allows the controller to program a
   label stack up to the advertised MSD value while leaving room for the
   local node to add more labels (e.g. service, entropy, transport...)
   without reaching the hardware/software limit.

                P7 ---- P8 ---- P9
               /                  \
     PE1 --- P1 --- P2 --- P3 --- P4 --- P5 --- P6 --- PE2
                                         | \             |
     ---->                               P10 \           |
     IP Pkt                              |     \         |
                                         P11 --- P12 --- P13
                                             100    10000

                                Figure 3

   In Figure 3, an IP packet comes into the MPLS network at PE1.  All
   metrics are considered equal to 1 except P12-P13, which is 10000, and
   P11-P12, which is 100.  PE1 wants to steer the traffic using a SPRING
   path to PE2 along
   PE1->P1->P7->P8->P9->P4->P5->P10->P11->P12->P13->PE2.  By using
   Adjacency SIDs only, PE1 will be required to push (as an I-LSR) 10
   labels on the IP packet received, and thus requires an MSD of 10.  If
   the IP packet should be carried over an MPLS service like a regular
   layer 3 VPN, an additional service label will be imposed, requiring
   an MSD of 11 for PE1.  In addition, if PE1 wants to insert an ELI/EL
   for load balancing purposes, PE1 will need to push 13 labels on the
   IP packet, requiring an MSD of 13.

   In the SPRING architecture, Node SIDs or Binding SIDs can be used to
   reduce the label stack size.  As an example, to steer the traffic on
   the same path as before, PE1 may be able to use the following label
   stack: .  In this example we consider a combination of Node SIDs and
   a Binding SID advertised by P5 that will stitch the traffic along the
   path P10->P11->P12->P13.  The instruction associated with the binding
   SID at P5 is thus to swap Binding_P5 to Adj_P12-P13 and then push .
   Because P5 acts as a stitching node that pushes additional labels
   onto an existing label stack, P5's MSD also needs to be taken into
   account, as it may limit the number of labels that can be imposed.

6.  LSP stitching using the binding SID

   The binding SID allows a segment identifier to be bound to an
   existing LSP.  As examples, the binding SID can represent an RSVP-TE
   tunnel, an LDP path (through the mapping server advertisement), a
   SPRING path...  Each LSP associated with a binding SID has its own
   entropy label capability.

   In Figure 3, consider that:

   o  P6, PE2, P10, P11, P12 are pure LDP routers.

   o  PE1, P1, P2, P3, P4, P7, P8, P9 are pure SPRING routers.

   o  P5 is running SPRING and LDP.

   o  P5 acts as a mapping server (MS) and advertises Prefix SIDs for
      the LDP FECs: an index value of 20 is used for PE2.

   o  All SPRING routers use an SRGB of [1000, 1999].

   o  P6 advertises label 20 for the PE2 FEC.

   o  Traffic from PE1 to PE2 uses the shortest path.

   PE1 ----- P1 -- P2 -- P3 -- P4 ---- P5 --- P6 --- PE2

   -->    +----+               +----+        +----+    +----+
   IP Pkt | IP |               | IP |        | IP |    | IP |
          +----+               +----+        +----+    +----+
          |1020|               |1020|        | 20 |
          +----+               +----+        +----+
                     SPRING                    LDP

   In terms of packet forwarding, by learning the MS advertisement from
   P5, PE1 imposes a label 1020 on an IP packet destined to PE2.  SPRING
   routers along the shortest path to PE2 will switch the traffic until
   it reaches P5, which will perform the LSP stitching.  P5 will swap
   the SPRING label 1020 to the LDP label 20 advertised by the nexthop
   P6.  P6 will then forward the packet using the LDP label towards PE2.

   PE1 cannot push an ELI/EL for the binding SID without knowing that
   the tail-end of the LSP associated with the binding (PE2) is entropy
   label capable.
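
   The label arithmetic and the swap performed at P5 in this example can
   be sketched as follows.  This is an illustrative Python sketch, not
   part of the draft: the names sid_to_label and stitch are assumptions,
   and only the values stated in the example (SRGB [1000, 1999], index
   20, LDP label 20) are used.

```python
SRGB = (1000, 1999)   # SRGB shared by all SPRING routers in the example


def sid_to_label(index, srgb=SRGB):
    """Map a SID index to an MPLS label inside the SRGB."""
    base, top = srgb
    label = base + index
    if not base <= label <= top:
        raise ValueError(f"index {index} falls outside SRGB {srgb}")
    return label


def stitch(stack, in_label, out_label):
    """Swap the top label, as P5 does when stitching SPRING to LDP."""
    if not stack or stack[0] != in_label:
        raise ValueError("top label does not match the stitched segment")
    return [out_label] + stack[1:]


spring_label = sid_to_label(20)               # label PE1 imposes for PE2
at_p5 = [spring_label]                        # label stack as received by P5
toward_p6 = stitch(at_p5, spring_label, 20)   # LDP label advertised by P6
print(spring_label, toward_p6)
```

   The sketch covers forwarding only; whether PE1 may add an ELI/EL
   below the SPRING label depends on the entropy label capability of
   the stitched LSP.
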

   To accommodate the mix of signalling protocols involved in the
   stitching, the entropy label capability SHOULD be propagated between
   the signalling protocols.  Each binding SID SHOULD have its own
   entropy label capability that MUST be inherited from the entropy
   label capability of the associated LSP.  If the router advertising
   the binding SID does not know the ELC state of the target FEC, it
   MUST NOT set the ELC for the binding SID.  An ingress node MUST NOT
   push an ELI/EL associated with a binding SID unless this binding SID
   has the entropy label capability.  How the entropy label capability
   is advertised for a binding SID is out of scope of this document.

   In our example, if PE2 is LDP entropy label capable, it will add the
   entropy label capability to its LDP advertisement.  When P5 receives
   the FEC/label binding for PE2, it learns about the ELC and can set
   the ELC in the mapping server advertisement.  Thus PE1 learns about
   the ELC of PE2 and may push an ELI/EL associated with the binding
   SID.

   The proposed solution works only if the SPRING router advertising the
   binding SID is also performing the dataplane LSP stitching.  In our
   example, if the mapping server function is hosted on P8 instead of
   P5, P8 does not know the ELC state of the PE2 LDP FEC.  As a
   consequence, it does not set the ELC on the associated binding SID.

7.  Insertion of entropy labels for SPRING path

7.1.  Overview

   The solution described in this section follows [RFC6790].  Within a
   SPRING path, a node may be ingress, egress, or transit (regarding the
   entropy label processing described in [RFC6790]), or it can be any
   combination of those.  For example:

   o  The ingress node of a SPRING domain may be an ingress node from an
      entropy label perspective.

   o  Any LSR terminating a segment of the SPRING path is an egress node
      (because it terminates the segment) but may also be a transit node
      if the SPRING path does not end there because there is a
      subsequent SPRING MPLS label in the stack.

   o  Any LSR processing a binding SID may be a transit node and an
      ingress node (because it may push additional labels when
      processing the binding SID).

   As described earlier, an LSR may have a limitation on the depth of
   the label stack that it can read and process in order to do multipath
   load balancing based on entropy labels: this is the ERLD.

   If an EL does not occur within the ERLD of an LSR in the label stack
   of the MPLS packet that it receives, then it would lead to poor load
   balancing at that LSR.  Hence an ELI/EL pair MUST be within the ERLD
   of the LSR in order for the LSR to use the EL during load balancing.

   Adding a single ELI/EL pair for the entire SPRING path may also lead
   to poor load balancing because the ELI/EL may not occur within the
   ERLD of some LSRs on the path (if too deep) or may no longer be
   present in the stack for some LSRs (if too shallow).

   In order for the EL to occur within the ERLD of LSRs along the path
   corresponding to a SPRING label stack, multiple ELI/EL pairs MAY be
   inserted in this label stack.

   The insertion of the ELI/EL SHOULD occur only with a SPRING label
   advertised by an LSR that advertised an ERLD (the LSR is entropy
   label capable) or with a SPRING label associated with a binding SID
   that has the ELC set.

   The ELs among multiple ELI/EL pairs inserted in the stack MAY be the
   same or different.  The LSR that inserts ELI/EL pairs MAY have
   limitations on the number of such pairs that it can insert and also
   the depth at which it can insert them.
   If, due to any limitation, the inserted ELs are at positions such
   that an LSR along the path receives an MPLS packet without an EL in
   the label stack within that LSR's ERLD, then the load balancing
   performed by that LSR would be poor.  An implementation MAY consider
   multiple criteria when inserting ELI/EL pairs.

7.1.1.  Example 1

                       ECMP           LAG           LAG
      PE1 --- P1 --- P2 --- P3 --- P4 --- P5 --- P6 --- PE2

                                Figure 4

   In Figure 4, PE1 wants to forward some MPLS VPN traffic over an
   explicit path to PE2, resulting in the following label stack to be
   pushed onto the received IP header: {VPN_label, Adj_P6PE2, Adj_P5P6,
   Adj_P4P5, Adj_P3P4, Adj_Bundle_P2P3, Adj_P1P2}.  PE1 is limited to
   pushing a maximum of 11 labels (MSD=11).  P2, P3 and P6 have an ERLD
   of 3 while the others have an ERLD of 10.

   PE1 can only add two ELI/EL pairs to the label stack due to its MSD
   limitation.  It should place them so as to benefit from load
   balancing along the longest part of the path.

   PE1 may take into account multiple parameters when placing the ELs,
   for example:

   o  the ERLD value advertised by transit nodes.

   o  the requirement of load balancing for a particular label value.

   o  any service provider preference: favor the beginning of the path
      or the end of the path.

   In Figure 4, a good strategy may be to use the following stack:
   {VPN_label, ELI2, EL2, Adj_P6PE2, Adj_P5P6, Adj_P4P5, Adj_P3P4, ELI1,
   EL1, Adj_Bundle_P2P3, Adj_P1P2}.  The original stack requests P2 to
   forward based on a bundle Adjacency segment that will require load
   balancing.  Therefore it is important to ensure that P2 can
   load-balance correctly.  As P2 has a limited ERLD of 3, the ELI/EL
   must be inserted just next to the label P2 will use to forward.  On
   the path to PE2, P3 also has a limited ERLD, but P3 will forward
   based on a basic adjacency segment that may require no load
   balancing.
   Therefore it does not seem important to ensure that P3 can do load
   balancing, despite its limited ERLD.  The next nodes along the
   forwarding path have a high ERLD that does not cause any issue,
   except P6.  Moreover, P6 is using some LAGs to PE2 and so is expected
   to load-balance.  It becomes important to insert a new ELI/EL just
   next to P6's forwarding label.

   In the case above, the ingress node had enough label push capacity to
   ensure end-to-end load balancing taking into account the path
   attributes.  There might be some cases where the ingress node does
   not have the necessary label push capacity.

7.1.2.  Example 2

                       ECMP           LAG          ECMP          ECMP
      PE1 --- P1 --- P2 --- P3 --- P4 --- P5 --- P6 --- P7 --- P8 --- PE2

                                Figure 5

   In Figure 5, PE1 wants to forward MPLS VPN traffic over an explicit
   path to PE2, resulting in the following label stack to be pushed onto
   the IP header: {VPN_label, Adj_Bundle_P8PE2, Adj_P7P8,
   Adj_Bundle_P6P7, Adj_P5P6, Adj_P4P5, Adj_P3P4, Adj_Bundle_P2P3,
   Adj_P1P2}.  PE1 is limited to pushing a maximum of 11 labels.  P2, P3
   and P6 have an ERLD of 3 while the others have an ERLD of 15.

   Using a strategy similar to the previous case may lead to a dilemma,
   as PE1 can only push a single ELI/EL while a minimum of three may be
   needed to load-balance the end-to-end path.  An optimized stack that
   would enable end-to-end load balancing may be: {VPN_label, ELI3, EL3,
   Adj_Bundle_P8PE2, Adj_P7P8, ELI2, EL2, Adj_Bundle_P6P7, Adj_P5P6,
   Adj_P4P5, Adj_P3P4, ELI1, EL1, Adj_Bundle_P2P3, Adj_P1P2}.

   A decision needs to be taken to favor some part of the path for load
   balancing, considering that load balancing may not work on the other
   part.  A service provider may decide to place the ELI/EL after P6's
   forwarding label as it will allow P4 and P6 to load-balance.  Placing
   the ELI/EL at the bottom of the stack is also a possibility, enabling
   load balancing for P4 and P8.

7.2.  Considerations for the placement of entropy labels

   The sample cases described in the previous section showed that
   placing the ELI/EL when the maximum number of labels to be pushed is
   limited is not an easy decision and that multiple criteria may be
   taken into account.

   This section describes some considerations that could be taken into
   account when placing ELI/ELs.  This list of criteria is not
   exhaustive, and an implementation MAY take into account additional
   criteria or tie breakers that are not documented here.

   An implementation SHOULD try to maximize load balancing where
   multiple ECMP paths are available and minimize the number of ELI/ELs
   that need to be inserted.  In case of a trade-off, an implementation
   MAY give the operator the flexibility to select the criteria to be
   taken into account when placing ELI/ELs or the sub-objective to be
   optimized for.

      PE1 -- P1 -- P2 -- P3 -- P4 -- P5 -- ... -- P8 -- P9 -- PE2
                         |            |
                         P3'--- P4'--- P5'

                                Figure 6

   Figure 6 will be used as a reference in the following subsections.

7.2.1.  ERLD value

   As mentioned in Section 7.1, the ERLD value is an important parameter
   to take into account when inserting an ELI/EL: if an ELI/EL does not
   fall within the ERLD of a node on the path, the node will not be able
   to load-balance the traffic efficiently.

   The ERLD value can be advertised via protocols, and those extensions
   are described in separate documents [I-D.ietf-isis-mpls-elc] and
   [I-D.ietf-ospf-mpls-elc].

   Let's consider a path from PE1 to PE2 using the following stack
   pushed by PE1: {Service_label, Adj_PE2P9, Node_P9, Adj_P1P2}.

   Using the ERLD as an input parameter may help to minimize the number
   of required ELI/EL pairs to be inserted.  An ERLD value must be
   retrieved for each SPRING label in the label stack.

   For a label bound to an adjacency segment, the ERLD is the ERLD of
   the node that advertised the adjacency segment.  In the example
   above, the ERLD associated with Adj_P1P2 would be the ERLD of router
   P1, as P1 will perform the forwarding based on the Adj_P1P2 label.

   For a label bound to a node segment, multiple strategies MAY be
   implemented.  An implementation may try to evaluate the minimum ERLD
   value along the node segment path.  If the implementation cannot find
   the minimum ERLD along the path of the segment, it can use the ERLD
   of the starting node instead.  In the example above, if the
   implementation supports computation of the minimum ERLD along the
   path, the ERLD associated with label Node_P9 would be the minimum
   ERLD among nodes {P2, P3, P4, ..., P8}.  If the implementation does
   not support the computation of the minimum ERLD, it should consider
   the ERLD of P2 (the starting node that will forward based on the
   Node_P9 label).

   For a label bound to a binding segment, if the binding segment
   describes a path, an implementation may also try to evaluate the
   minimum ERLD along this path.  If the implementation cannot find the
   minimum ERLD along the path of the segment, it can use the ERLD of
   the starting node instead.

7.2.2.  Segment type

   Depending on the type of segment a particular label is bound to, an
   implementation may deduce that this particular label will be subject
   to load balancing on the path.

7.2.2.1.  Node-SID

   An MPLS label bound to a Node-SID represents a path that may cross
   multiple hops.  Load balancing may be needed on the node starting
   this path but also on any node along the path.

   Let's consider a path from PE1 to PE2 using the following stack
   pushed by PE1: {Service_label, Adj_PE2P9, Node_P9, Adj_P1P2}.

   If, for example, PE1 is limited to 6 labels to be pushed, it can add
   a single ELI/EL within the label stack.
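
   The push budget behind this statement is simple arithmetic and can be
   sketched as below.  This Python fragment is illustrative only; the
   function name max_eli_el_pairs is an assumption, and the sketch
   ignores any other limit an implementation may have (for example, on
   the depth at which a pair can be inserted).

```python
def max_eli_el_pairs(stack_depth, msd):
    """Number of ELI/EL pairs that fit in the push budget left by the stack."""
    spare = msd - stack_depth
    return max(spare, 0) // 2   # each pair costs two labels (ELI + EL)


# Stack of this section: four labels, PE1 limited to pushing 6 labels.
stack = ["Service_label", "Adj_PE2P9", "Node_P9", "Adj_P1P2"]
print(max_eli_el_pairs(len(stack), msd=6))
```

   The same computation gives two pairs for Example 1 (7 labels, MSD of
   11) and one pair for Example 2 (9 labels, MSD of 11), in line with
   Section 7.1.
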
   An operator may want to favor a placement that would allow load
   balancing along the Node-SID path.  In Figure 6, P3, which is along
   the Node-SID path, requires load balancing on two equal cost paths.

   An implementation may try to evaluate whether load balancing is
   really required within a node segment path: this could be done by
   running additional SPT computations and by analyzing the node segment
   path.  A node segment that does not really require load balancing may
   then not be preferred when placing ELI/ELs.  Such inspection may be
   time consuming for implementations and comes without a 100%
   guarantee, as a node segment path may use a LAG that could be
   invisible from the IP topology.  A simpler approach would be to
   consider that a label bound to a Node-SID will be subject to load
   balancing and so requires an ELI/EL.

7.2.2.2.  Adjacency-SID representing an ECMP bundle

   When an adjacency segment representing an ECMP bundle is used within
   a label stack, an implementation can deduce that load balancing is
   expected at the node that advertised this adjacency segment.  An
   implementation could then favor this particular label value when
   placing ELI/ELs.

7.2.2.3.  Adjacency-SID representing a single IP link

   When an adjacency segment representing a single IP link is used
   within a label stack, an implementation can deduce that load
   balancing may not be expected at the node that advertised this
   adjacency segment.

   The implementation could then decide to place ELI/ELs to favor LSRs
   other than the one advertising this adjacency segment.

   Readers should note that an adjacency segment representing a single
   IP link may require load balancing.  This is the case when a LAG (L2
   bundle) is implemented between two IP nodes and L2 bundle SR
   extensions [I-D.ietf-isis-l2bundles] are not implemented.
   In such a case, it may be useful to keep the possibility of
   inserting an EL/ELI in a position readable by the LSR advertising
   the label associated with the adjacency segment.

7.2.2.4. Adjacency-SID representing a single link within a L2 bundle

   When L2 bundle SR extensions [I-D.ietf-isis-l2bundles] are used,
   adjacency segments may be advertised for each member of the bundle.
   In this case, an implementation can deduce that load balancing is not
   expected on the LSR advertising this segment and could then decide to
   place ELI/ELs to favor LSRs other than the one advertising this
   adjacency segment.

7.2.2.5. Adjacency-SID representing a L2 bundle

   When L2 bundle SR extensions [I-D.ietf-isis-l2bundles] are used, an
   adjacency segment may be advertised to represent the bundle. In this
   case, an implementation can deduce that load balancing is expected on
   the LSR advertising this segment and could then decide to place
   ELI/ELs to favor this LSR.

7.2.3. Maximizing number of LSRs that will load balance

   When placing ELI/ELs, an implementation may try to maximize the
   number of LSRs that both need to load balance (i.e., have ECMP
   paths) and are able to perform load balancing (i.e., the EL is
   within their ERLD).

   Consider a path from PE1 to PE2 using the following stack pushed by
   PE1: {Service_label, Adj_PE2P9, Node_P9, Adj_P1P2}. All routers have
   an ERLD of 10, except P1 and P2, which have an ERLD of 4. PE1 is
   able to push 6 labels, so only a single ELI/EL can be added.

   In the example above, adding an ELI/EL next to Adj_P1P2 will only
   allow load balancing at P1, while inserting it next to Adj_PE2P9
   will allow load balancing at P2, P3, ..., P9, thus maximizing the
   number of LSRs that could perform load balancing.

7.2.4. Preference for a part of the path

   An implementation may offer to favor a part of the end-to-end path
   when the number of EL/ELIs that can be pushed is not enough to cover
   the entire path. As an example, a service provider may want to favor
   load balancing at the beginning of the path or at the end of the
   path, so the implementation should prefer putting the ELI/ELs near
   the top or near the bottom of the stack.

7.2.5. Combining criteria

   An implementation can combine multiple criteria to determine the best
   EL/ELI placement. However, combining too many criteria may lead to
   implementation complexity and high control plane resource
   consumption. Each time the network topology changes, a new
   evaluation of the EL/ELI placement will be necessary.

8. A simple algorithm example

   A simple implementation can take into account only the ERLD when
   placing ELI/ELs, while minimizing the number of EL/ELIs inserted and
   maximizing the number of LSRs that can load balance.

   The algorithm example is based on the following considerations:

   o  An LSR that is limited in the number of <ELI, EL> pairs that it
      can insert SHOULD insert such pairs deeper in the stack.

   o  An LSR should try to insert <ELI, EL> pairs at positions so that
      for the maximum number of transit LSRs, the EL occurs within the
      ERLD of the incoming packet to that LSR.

   o  An LSR should try to insert the minimum number of such pairs while
      trying to satisfy the above criteria.

   The pseudocode of the example is shown below.

      Initialize the current EL insertion point to the
        bottommost label in the stack that is EL-capable
      while (local-node can push more <ELI, EL> pairs AND
          insertion point is not above label stack) {
          insert an <ELI, EL> pair below current insertion point
          move new insertion point up from current insertion point until
              ((last inserted EL is below the ERLD) AND (ERLD > 2)
               AND
               (new insertion point is EL-capable))
          set current insertion point to new insertion point
      }

   Figure 7: Example algorithm to insert <ELI, EL> pairs in a label
                                  stack

   When this algorithm is applied to the example described in Section 3,
   it will result in ELs being inserted in two positions, one below the
   label L_N-D and another below L_N-P3. Thus the resulting label stack
   would be {L_N-P3, ELI, EL, L_A-L1, L_N-D, ELI, EL}.

9. Deployment Considerations

   As long as LSR dataplane capabilities are limited (number of labels
   that can be pushed, or number of labels that can be inspected),
   hop-by-hop load balancing of SPRING encapsulated flows will require
   trade-offs.

   Entropy label is still a good and usable solution as it allows load
   balancing without having to perform deep packet inspection on each
   LSR: it does not seem reasonable to have an LSR inspect UDP ports
   within a GRE tunnel carried over a 15-label SPRING tunnel.

   Due to the limited capacity for reading a deep stack of MPLS labels,
   multiple EL/ELIs may be required within the stack, which directly
   impacts the capacity of the head-end to push a deep stack: each
   EL/ELI inserted requires two additional labels to be pushed.

   EL/ELI placement strategies are therefore required to find the best
   trade-off. Multiple criteria may be taken into account and some
   level of customization (by the user) may be required to accommodate
   the different deployments.

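   As an illustration of one such placement strategy, the example
   algorithm of Section 8 can be sketched in Python. This is only one
   possible reading of the pseudocode, not a reference implementation.
   The names Label and place_el_pairs are hypothetical, each label
   carries the ERLD associated with it as described in Section 7.2.1,
   and the uniform ERLD of 4 used in the example is an assumption made
   for illustration.

```python
from typing import List, NamedTuple


class Label(NamedTuple):
    name: str
    erld: int                 # ERLD associated with this label (Section 7.2.1)
    el_capable: bool = True   # hypothetical flag: a pair may follow this label


def place_el_pairs(stack: List[Label], max_pairs: int) -> List[str]:
    """Return label names, top of stack first, with ELI/EL pairs inserted."""
    insert_below = set()
    # Start at the bottommost EL-capable label.
    i = len(stack) - 1
    while i >= 0 and not stack[i].el_capable:
        i -= 1
    while i >= 0 and len(insert_below) < max_pairs:
        insert_below.add(i)
        last = i
        # Move up until reaching a label whose ERLD cannot reach the
        # last inserted EL.
        i -= 1
        while i >= 0:
            # Depth of that EL when stack[i] is the top label:
            # labels i..last, then the ELI, then the EL.
            depth = (last - i + 1) + 2
            lbl = stack[i]
            if depth > lbl.erld and lbl.erld > 2 and lbl.el_capable:
                break
            i -= 1
    out: List[str] = []
    for idx, lbl in enumerate(stack):
        out.append(lbl.name)
        if idx in insert_below:
            out += ["ELI", "EL"]
    return out


# Assumed ERLD of 4 for every label (illustrative values only).
example = [Label("L_N-P3", 4), Label("L_A-L1", 4), Label("L_N-D", 4)]
print(place_el_pairs(example, max_pairs=2))
```

   Under these assumptions the sketch places one pair below L_N-D and
   one below L_N-P3. When max_pairs is 1, only the bottom pair is
   inserted, in line with the first consideration of Section 8.
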
   Analyzing the path of each destination to determine the best EL/ELI
   placement may be time consuming for the control plane. We encourage
   implementations to find the best trade-off between simplicity,
   resource consumption and load balancing efficiency.

   In the future, hardware and software capacity may increase dataplane
   capabilities and may remove some of those limits: this may increase
   the capacity of load balancing using entropy labels.

10. Options considered

   Different options that were considered to arrive at the recommended
   solution are documented in this section.

10.1. Single EL at the bottom of the stack of tunnels

   In this option a single EL is used for the entire label stack. The
   source LSR S encodes the entropy label (EL) at the bottom of the
   label stack. In the example described in Section 3, it will result
   in the label stack at LSR S looking like {L_N-P3, L_A-L1, L_N-D,
   ELI, EL} {remaining packet header}. Note that the notation in
   [RFC6790] is used to describe the label stack. An issue with this
   approach is that as the label stack grows due to an increase in the
   number of SIDs, the EL goes correspondingly deeper in the label
   stack. Hence transit LSRs have to access a larger number of bytes in
   the packet header when making forwarding decisions. In the example
   described in Section 3, the LSR P1 would poorly load balance traffic
   on the parallel links L3, L4 since the EL is below the RLD of the
   packet received by P1. A load balanced network design using this
   approach must ensure that all intermediate LSRs have the capability
   to traverse the maximum label stack depth as required for the
   application that uses source routed stacking.

   In the case where the hardware is capable of pushing a single
   <ELI, EL> pair at any depth, this option is the same as the
   recommended solution in Section 7.

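   The depth issue described for this option can be illustrated with a
   short Python sketch. The helper names el_depth and can_use_el are
   hypothetical, the stack is listed top label first, and the RLD of 4
   assumed for P1 is an illustrative value rather than one defined in
   this section.

```python
from typing import List, Optional


def el_depth(stack: List[str]) -> Optional[int]:
    """1-based depth of the first EL, i.e. the label right after the first ELI."""
    for i, label in enumerate(stack[:-1]):
        if label == "ELI":
            return i + 2  # ELI sits at depth i + 1, its EL at depth i + 2
    return None


def can_use_el(stack: List[str], rld: int) -> bool:
    """True if the first EL lies within the readable label depth."""
    depth = el_depth(stack)
    return depth is not None and depth <= rld


single_el_at_bottom = ["L_N-P3", "L_A-L1", "L_N-D", "ELI", "EL"]
el_below_top_label = ["L_N-P3", "ELI", "EL", "L_A-L1", "L_N-D", "ELI", "EL"]

P1_RLD = 4  # assumed value for illustration
print(el_depth(single_el_at_bottom), can_use_el(single_el_at_bottom, P1_RLD))
print(el_depth(el_below_top_label), can_use_el(el_below_top_label, P1_RLD))
```

   With a single EL at the bottom, the EL sits at depth 5 and is out of
   reach for an LSR that reads only 4 labels. An EL placed directly
   below the top label sits at depth 3 and remains readable.
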
   This option was rejected since there exist a number of hardware
   implementations which have a low maximum readable label depth.
   Choosing this option can lead to a loss of load balancing using EL in
   a significant part of the network, even though load balancing is a
   critical requirement in a service provider network.

10.2. An EL per tunnel in the stack

   In this option each tunnel in the stack can be given its own EL. The
   source LSR pushes an <ELI, EL> before pushing a tunnel label when
   load balancing is required to direct traffic on that tunnel. In the
   example described in Section 3, the source LSR S encoded label stack
   would be {L_N-P3, ELI, EL, L_A-L1, L_N-D, ELI, EL} where all the ELs
   can be the same. Accessing the EL at an intermediate LSR is
   independent of the depth of the label stack and hence independent of
   the specific application that uses source routed tunnels with label
   stacking in that network. A drawback is that the depth of the label
   stack grows significantly, to almost 3 times the number of labels in
   the label stack. The network design should ensure that source LSRs
   have the capability to push such a deep label stack. Also, the
   bandwidth overhead and potential MTU issues of deep label stacks
   should be accounted for in the network design.

   In the case where the RLD is the minimum value (3) for all LSRs, all
   LSRs are EL capable and the LSR that is inserting <ELI, EL> pairs has
   no limit on how many it can insert, then this option is the same as
   the recommended solution in Section 7.

   This option was rejected due to the existence of hardware
   implementations that can push a limited number of labels on the label
   stack. Choosing this option would result in a hardware requirement
   to push two additional labels per tunnel label. Hence it would
   restrict the number of tunnels that can be stacked in an LSP and
   hence constrain the types of LSPs that can be created.
   This was considered unacceptable.

10.3. A re-usable EL for a stack of tunnels

   In this option an LSR that terminates a tunnel re-uses the EL of the
   terminated tunnel for the next inner tunnel. It does this by storing
   the EL from the outer tunnel when that tunnel is terminated and
   re-inserting it below the next inner tunnel label during the label
   swap operation. The LSR that stacks tunnels should insert an EL
   below the outermost tunnel. It should not insert ELs for any inner
   tunnels. Also, the penultimate hop LSR of a segment must not pop the
   ELI and EL even though they are exposed as the top labels, since the
   terminating LSR of that segment would re-use the EL for the next
   segment.

   In Section 3 above, the source LSR S encoded label stack would be
   {L_N-P3, ELI, EL, L_A-L1, L_N-D}. At P1 the outgoing label stack
   would be {L_N-P3, ELI, EL, L_A-L1, L_N-D} after it has load balanced
   to one of the links L3 or L4. At P3 the outgoing label stack would
   be {L_N-D, ELI, EL}. At P2 the outgoing label stack would be {L_N-D,
   ELI, EL} and it would load balance to one of the nexthop LSRs P4 or
   P5. Accessing the EL at an intermediate LSR (e.g., P1) is
   independent of the depth of the label stack and hence independent of
   the specific use-case to which the label stack is applied.

   This option was rejected due to the significant change in label swap
   operations that would be required for existing hardware.

10.4. EL at top of stack

   A slight variant of the re-usable EL option is to keep the EL at the
   top of the stack rather than below the tunnel label. In this case
   each LSR that is not terminating a segment should continue to keep
   the received EL at the top of the stack when forwarding the packet
   along the segment. An LSR that terminates a segment should use the
   EL from the terminated segment at the top of the stack when
   forwarding onto the next segment.

   This option was rejected due to the significant change in label swap
   operations that would be required for existing hardware.

10.5. ELs at readable label stack depths

   In this option the source LSR inserts ELs for tunnels in the label
   stack at depths such that each LSR along the path that must load
   balance is able to access at least one EL. Note that the source LSR
   may have to insert multiple ELs in the label stack at different
   depths for this to work since intermediate LSRs may have differing
   capabilities in accessing the depth of a label stack. The label
   stack depth access value of intermediate LSRs must be known to create
   such a label stack. How this value is determined is outside the
   scope of this document. This value can be advertised using a
   protocol such as an IGP.

   Applying this method to the example in Section 3 above, if LSR P1
   needs to have the EL within a depth of 4, then the source LSR S
   encoded label stack would be {L_N-P3, ELI, EL, L_A-L1, L_N-D, ELI,
   EL} where all the ELs would typically have the same value.

   In the case where the RLD has different values along the path and the
   LSR that is inserting <ELI, EL> pairs has no limit on how many pairs
   it can insert, and it knows the appropriate positions in the stack
   where they should be inserted, then this option is the same as the
   recommended solution in Section 7.

   Note that a refinement of this solution which balances the number of
   pushed labels against the desired entropy is the solution described
   in Section 7.

11. Acknowledgements

   The authors would like to thank John Drake, Loa Andersson, Curtis
   Villamizar, Greg Mirsky, Markus Jork, Kamran Raza, Carlos Pignataro,
   Bruno Decraene and Nobo Akiya for their review comments and
   suggestions.

12. Contributors

   Xiaohu Xu
   Huawei

   Email: xuxiaohu@huawei.com

   Wim Henderickx
   Nokia

   Email: wim.henderickx@nokia.com

   Gunter Van De Velde
   Nokia

   Email: gunter.van_de_velde@nokia.com

   Acee Lindem
   Cisco

   Email: acee@cisco.com

13. IANA Considerations

   This memo includes no request to IANA. Note to RFC Editor: Remove
   this section before publication.

14. Security Considerations

   This document does not introduce any new security considerations
   beyond those already listed in [RFC6790].

15. References

15.1. Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC6790]  Kompella, K., Drake, J., Amante, S., Henderickx, W., and
              L. Yong, "The Use of Entropy Labels in MPLS Forwarding",
              RFC 6790, DOI 10.17487/RFC6790, November 2012.

   [RFC7855]  Previdi, S., Ed., Filsfils, C., Ed., Decraene, B.,
              Litkowski, S., Horneffer, M., and R. Shakir, "Source
              Packet Routing in Networking (SPRING) Problem Statement
              and Requirements", RFC 7855, DOI 10.17487/RFC7855, May
              2016.

15.2. Informative References

   [RFC4206]  Kompella, K. and Y. Rekhter, "Label Switched Paths (LSP)
              Hierarchy with Generalized Multi-Protocol Label Switching
              (GMPLS) Traffic Engineering (TE)", RFC 4206,
              DOI 10.17487/RFC4206, October 2005.

   [RFC7325]  Villamizar, C., Ed., Kompella, K., Amante, S., Malis, A.,
              and C. Pignataro, "MPLS Forwarding Compliance and
              Performance Requirements", RFC 7325, DOI 10.17487/RFC7325,
              August 2014.

   [I-D.ietf-spring-segment-routing]
              Filsfils, C., Previdi, S., Decraene, B., Litkowski, S.,
              and R. Shakir, "Segment Routing Architecture", draft-ietf-
              spring-segment-routing-11 (work in progress), February
              2017.

   [I-D.ietf-isis-mpls-elc]
              Xu, X., Kini, S., Sivabalan, S., Filsfils, C., and S.
              Litkowski, "Signaling Entropy Label Capability Using IS-
              IS", draft-ietf-isis-mpls-elc-02 (work in progress),
              October 2016.

   [I-D.ietf-ospf-mpls-elc]
              Xu, X., Kini, S., Sivabalan, S., Filsfils, C., and S.
              Litkowski, "Signaling Entropy Label Capability Using
              OSPF", draft-ietf-ospf-mpls-elc-04 (work in progress),
              November 2016.

   [I-D.ietf-isis-l2bundles]
              Ginsberg, L., Bashandy, A., Filsfils, C., Nanduri, M., and
              E. Aries, "Advertising L2 Bundle Member Link Attributes in
              IS-IS", draft-ietf-isis-l2bundles-04 (work in progress),
              April 2017.

Authors' Addresses

   Sriganesh Kini

   EMail: sriganeshkini@gmail.com

   Kireeti Kompella
   Juniper

   EMail: kireeti@juniper.net

   Siva Sivabalan
   Cisco

   EMail: msiva@cisco.com

   Stephane Litkowski
   Orange

   EMail: stephane.litkowski@orange.com

   Rob Shakir
   Google

   EMail: rjs@rob.sh

   Jeff Tantsura

   EMail: jefftant@gmail.com