Network Working Group                                            S. Kini
Internet-Draft
Intended status: Informational                               K. Kompella
Expires: April 20, 2018                                          Juniper
                                                            S. Sivabalan
                                                                   Cisco
                                                            S. Litkowski
                                                                  Orange
                                                               R. Shakir
                                                                  Google
                                                             J. Tantsura
                                                        October 17, 2017


                    Entropy label for SPRING tunnels
                draft-ietf-mpls-spring-entropy-label-07

Abstract

   Segment Routing (SR) leverages the source routing paradigm. A node
   steers a packet through an ordered list of instructions, called
   segments. Segment Routing can be applied to the Multi Protocol Label
   Switching (MPLS) data plane. Entropy label (EL) is a technique used
   in MPLS to improve load-balancing. This document examines and
   describes how ELs are to be applied to Segment Routing when applied
   to the MPLS dataplane.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on April 20, 2018.

Copyright Notice

   Copyright (c) 2017 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.
   Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  Requirements Language
   2.  Abbreviations and Terminology
   3.  Use-case requiring multipath load-balancing
   4.  Entropy Readable Label Depth
   5.  Maximum SID Depth
   6.  LSP stitching using the binding SID
   7.  Insertion of entropy labels for SPRING path
     7.1.  Overview
       7.1.1.  Example 1 where the ingress node has a sufficient MSD
       7.1.2.  Example 2 where the ingress node has not a sufficient
               MSD
     7.2.  Considerations for the placement of entropy labels
       7.2.1.  ERLD value
       7.2.2.  Segment type
         7.2.2.1.  Node-SID
         7.2.2.2.  Adjacency-set SID
         7.2.2.3.  Adjacency-SID representing a single IP link
         7.2.2.4.  Adjacency-SID representing a single link within a
                   L2 bundle
         7.2.2.5.  Adjacency-SID representing a L2 bundle
       7.2.3.  Maximizing number of LSRs that will load-balance
       7.2.4.  Preference for a part of the path
       7.2.5.  Combining criteria
   8.  A simple example algorithm
   9.  Deployment Considerations
   10. Options considered
     10.1.  Single EL at the bottom of the stack
     10.2.  An EL per segment in the stack
     10.3.  A re-usable EL for a stack of tunnels
     10.4.  EL at top of stack
     10.5.  ELs at readable label stack depths
   11. Acknowledgements
   12. Contributors
   13. IANA Considerations
   14. Security Considerations
   15. References
     15.1.  Normative References
     15.2.  Informative References
   Authors' Addresses

1.  Introduction

   Segment Routing [I-D.ietf-spring-segment-routing] is based on source
   routed tunnels to steer a packet along a particular path. This path
   is encoded as an ordered list of segments. When applied to the MPLS
   dataplane [I-D.ietf-spring-segment-routing-mpls], each segment is an
   LSP with an associated MPLS label value. Hence, label stacking is
   used to represent the ordered list of segments and the label stack
   associated with an SR tunnel can be seen as nested LSPs (LSP
   hierarchy) in the MPLS architecture.

   Using label stacking to encode the list of segments has implications
   on the label stack depth.

   Entropy label (EL) [RFC6790] is a technique used in the MPLS data
   plane to provide entropy for load-balancing. When using LSP
   hierarchies, there are implications on how [RFC6790] should be
   applied. The current document addresses the case where a hierarchy
   is created at a single LSR as required by Segment Routing.

   A use-case requiring load-balancing with SR is given in Section 3. A
   recommended solution is described in Section 7, taking into
   consideration the limitations of implementations when applying
   [RFC6790] to deeper label stacks. Options that were considered to
   arrive at the recommended solution are documented for historical
   purposes in Section 10.

1.1.  Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

   Although this document is not a protocol specification, the use of
   this language clarifies the instructions to protocol designers
   producing solutions that satisfy the requirements set out in this
   document.

2.  Abbreviations and Terminology

   EL - Entropy Label

   ELI - Entropy Label Indicator

   ELC - Entropy Label Capability

   ERLD - Entropy Readable Label Depth

   SR - Segment Routing

   ECMP - Equal Cost Multi Path

   LSR - Label Switch Router

   MPLS - Multiprotocol Label Switching

   MSD - Maximum SID Depth

   SID - Segment Identifier

   RLD - Readable Label Depth

   OAM - Operations, Administration, and Maintenance

3.  Use-case requiring multipath load-balancing

                         +------+
                         |      |
                 +-------|  P3  |-----+
                 | +-----|      |---+ |
               L3| |L4   +------+ L1| |L2     +----+
                 | |                | |    +--| P4 |--+
   +-----+     +-----+            +-----+  |  +----+  |  +-----+
   |  S  |-----| P1  |------------| P2  |--+          +--|  D  |
   |     |     |     |            |     |--+          +--|     |
   +-----+     +-----+            +-----+  |  +----+  |  +-----+
                                           +--| P5 |--+
                                              +----+
   S=Source LSR, D=Destination LSR, P1,P2,P3,P4,P5=Transit LSRs,
   L1,L2,L3,L4=Links

                  Figure 1: Traffic engineering use-case

   Traffic-engineering is one of the applications of MPLS and is also a
   requirement for source routed tunnels with label stacks [RFC7855].

   Consider the topology shown in Figure 1. The LSR S requires data to
   be sent to LSR D along a traffic-engineered path that goes over the
   link L1. Good load-balancing is also required across equal cost
   paths (including parallel links). To engineer traffic along a path
   that takes link L1, the label stack that LSR S creates consists of a
   label to the node SID of LSR P3, stacked over the label for the
   adjacency SID of link L1 and that in turn is stacked over the label
   to the node SID of LSR D. For simplicity, let's assume that all LSRs
   use the same label space (SRGB) for source routed label stacks. Let
   L_N-Px denote the label to be used to reach the node SID of LSR Px.
   Let L_A-Ln denote the label used for the adjacency SID for link Ln.
   The LSR S must use the label stack <L_N-P3, L_A-L1, L_N-D> for
   traffic-engineering. However, to achieve good load-balancing over
   the equal cost paths P2-P4-D, P2-P5-D and the parallel links L3, L4,
   a mechanism such as Entropy labels [RFC6790] should be adapted for
   source routed label stacks. Indeed, the SPRING architecture with the
   MPLS dataplane ([I-D.ietf-spring-segment-routing-mpls]) uses nested
   MPLS LSPs composing the source routed label stacks. As each MPLS
   node may have limitations in the number of labels it can push when it
   is ingress or inspect when doing load-balancing, an entropy label
   insertion strategy becomes important to keep the benefit of the load-
   balancing. Multiple ways to apply entropy labels were considered and
   are documented in Section 10 along with their trade-offs. A
   recommended solution is described in Section 7.

4.  Entropy Readable Label Depth

   The Entropy Readable Label Depth (ERLD) is defined as the number of
   labels a router can both:

   a.  Read in an MPLS packet received on its incoming interface(s)
       (starting from the top of the stack).

   b.  Use in its load-balancing function.
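
   As an illustration only, the definition above can be read as a
   position check on a received label stack. The minimal Python sketch
   below assumes a stack modeled as a list of label values with the top
   of the stack first; the function names and the EL value 1000 in the
   usage note are hypothetical, while 7 is the ELI value reserved by
   [RFC6790].

```python
ELI = 7  # Entropy Label Indicator, reserved label value from RFC 6790


def el_position(stack):
    """Return the 1-based position of the first EL, or None if absent.

    `stack` lists label values from the top of the stack downwards
    (an assumption of this sketch). The EL is the label that
    immediately follows the ELI.
    """
    for index, label in enumerate(stack[:-1]):
        if label == ELI:
            # index is 0-based and points at the ELI; the EL is one
            # label deeper, so its 1-based position is index + 2.
            return index + 2
    return None


def can_use_el(stack, erld):
    """True if the EL lies within the first `erld` labels of the stack."""
    position = el_position(stack)
    return position is not None and position <= erld
```

   For example, can_use_el([16, 7, 1000], 3) is True because the EL is
   at position 3, whereas can_use_el([16, 20, 30, 40, 7, 1000], 5) is
   False because the EL is at position 6. These two stacks mirror
   Packet 1 and Packet 4 of Figure 2.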

   The ERLD means that the router will perform load-balancing using the
   EL if the EL is placed within the first ERLD labels of the stack.

   A router capable of reading N labels but not using an EL located
   within those N labels MUST consider its ERLD to be 0. In a
   distributed switching architecture, each linecard may have a
   different capability in terms of ERLD. For simplicity, an
   implementation MAY use the minimum ERLD among all its linecards as
   the ERLD value for the system.

   Examples:

                                                         | Payload  |
                                                         +----------+
                                           | Payload  |  |    EL    | P7
                                           +----------+  +----------+
                             | Payload  |  |    EL    |  |   ELI    |
                             +----------+  +----------+  +----------+
               | Payload  |  |    EL    |  |   ELI    |  | Label 50 |
               +----------+  +----------+  +----------+  +----------+
 | Payload  |  |    EL    |  |   ELI    |  | Label 40 |  | Label 40 |
 +----------+  +----------+  +----------+  +----------+  +----------+
 |    EL    |  |   ELI    |  | Label 30 |  | Label 30 |  | Label 30 |
 +----------+  +----------+  +----------+  +----------+  +----------+
 |   ELI    |  | Label 20 |  | Label 20 |  | Label 20 |  | Label 20 |
 +----------+  +----------+  +----------+  +----------+  +----------+
 | Label 16 |  | Label 16 |  | Label 16 |  | Label 16 |  | Label 16 | P1
 +----------+  +----------+  +----------+  +----------+  +----------+
   Packet 1      Packet 2      Packet 3      Packet 4      Packet 5

                   Figure 2: Label stacks with ELI/EL

   In Figure 2, we consider the displayed packets received on a router
   interface. We also consider a single ERLD value for the router.

   o  If the router has an ERLD of 3, it will be able to load-balance
      Packet 1 displayed in Figure 2 using the EL as part of the load-
      balancing keys. The ERLD value of 3 means that the router can
      read and take into account the entropy label for load-balancing if
      it is placed between position 1 (top) and position 3.

   o  If the router has an ERLD of 5, it will be able to load-balance
      Packets 1 to 3 in Figure 2 using the EL as part of the load-
      balancing keys. Packets 4 and 5 have the EL placed at a position
      greater than 5, so the router is not able to read it and use it as
      part of the load-balancing keys.

   o  If the router has an ERLD of 10, it will be able to load-balance
      all the packets displayed in Figure 2 using the EL as part of the
      load-balancing keys.

   To allow an efficient load-balancing based on entropy labels, a
   router running SPRING SHOULD advertise its ERLD (or ERLDs), so all
   the other SPRING routers in the network are aware of its capability.
   How this advertisement is done is outside the scope of this document.

   To advertise an ERLD value, a SPRING router:

   o  MUST be entropy label capable and, as a consequence, MUST apply
      the dataplane procedures defined in [RFC6790].

   o  MUST be able to read an ELI/EL which is located within its ERLD
      value.

   o  MUST take into account this EL in its load-balancing function.

5.  Maximum SID Depth

   The Maximum SID Depth defines the maximum number of labels that a
   particular node can impose on a packet. This includes any kind of
   labels (service, entropy, transport...). In an MPLS network, the MSD
   is a limit of the Ingress LSR (I-LSR) or any stitching node that
   would perform an imposition of additional labels on an existing label
   stack.

   Depending on the number of MPLS operations (POP, SWAP...) to be
   performed before the PUSH, the MSD may vary due to hardware or
   software limitations. As for the ERLD, there may also be different
   MSD limits based on the linecard type used in a distributed switching
   system.

   When an external controller is used to program a label stack on a
   particular node, this node MAY advertise its MSD value or a subset of
   its MSD value to the controller. How this advertisement is done is
   outside the scope of this document. As the controller does not have
   the knowledge of the entire label stack to be pushed by the node, the
   node may advertise an MSD value which is lower than its actual limit.
   This gives the ability for the controller to program a label stack up
   to the advertised MSD value while leaving room for the local node to
   add more labels (e.g., service, entropy, transport...) without
   reaching the hardware/software limit.

              P7 ---- P8 ---- P9
             /                  \
   PE1 --- P1 --- P2 --- P3 --- P4 --- P5 --- P6 --- PE2
                                       | \            |
   ---->                              P10 \           |
   IP Pkt                              |   \          |
                                      P11 --- P12 --- P13
                                          100    10000

                                Figure 3

   In Figure 3, an IP packet comes into the MPLS network at PE1. All
   metrics are considered equal to 1 except P12-P13 which is 10000 and
   P11-P12 which is 100. PE1 wants to steer the traffic using a SPRING
   path to PE2 along
   PE1->P1->P7->P8->P9->P4->P5->P10->P11->P12->P13->PE2. By using
   adjacency SIDs only, PE1 (acting as an I-LSR) will be required to
   push 10 labels on the IP packet received and thus requires an MSD of
   10. If the IP packet should be carried over an MPLS service like a
   regular layer 3 VPN, an additional service label should be imposed,
   requiring an MSD of 11 for PE1. In addition, if PE1 wants to insert
   an ELI/EL for load-balancing purposes, PE1 will need to push 13
   labels on the IP packet, requiring an MSD of 13.

   In the SPRING architecture, Node SIDs or Binding SIDs can be used to
   reduce the label stack size. As an example, to steer the traffic on
   the same path as before, PE1 may be able to use the following label
   stack: . In this example we
   consider a combination of Node SIDs and a Binding SID advertised by
   P5 that will stitch the traffic along the path P10->P11->P12->P13.
   The instruction associated with the binding SID at P5 is thus to swap
   Binding_P5 to Adj_P12-P13 and then push . Because P5
   acts as a stitching node that pushes additional labels on an existing
   label stack, P5's MSD also needs to be taken into account and may
   limit the number of labels that can be imposed.

6.  LSP stitching using the binding SID

   The binding SID allows binding a segment identifier to an existing
   LSP. As examples, the binding SID can represent an RSVP-TE tunnel,
   an LDP path (through the mapping server advertisement), or a SPRING
   path. Each LSP associated with a binding SID has its own entropy
   label capability.

   In Figure 3, we consider that:

   o  P6, PE2, P10, P11, P12, P13 are pure LDP routers.

   o  PE1, P1, P2, P3, P4, P7, P8, P9 are pure SPRING routers.

   o  P5 is running SPRING and LDP.

   o  P5 acts as a mapping server and advertises Prefix SIDs for the LDP
      FECs: an index value of 20 is used for PE2.

   o  All SPRING routers use an SRGB of [1000, 1999].

   o  P6 advertises label 20 for the PE2 FEC.

   o  Traffic from PE1 to PE2 uses the shortest path.

   PE1 ----- P1 -- P2 -- P3 -- P4 ---- P5 --- P6 --- PE2

-->    +----+                    +----+  +----+ +----+
IP Pkt | IP |                    | IP |  | IP | | IP |
       +----+                    +----+  +----+ +----+
       |1020|                    |1020|  | 20 |
       +----+                    +----+  +----+
                    SPRING                LDP

   In terms of packet forwarding, by learning the mapping-server
   advertisement from P5, PE1 imposes a label 1020 on an IP packet
   destined to PE2. SPRING routers along the shortest path to PE2
   will switch the traffic until it reaches P5 which will perform the
   LSP stitching. P5 will swap the SPRING label 1020 to the LDP label
   20 advertised by the nexthop P6. P6 will then forward the packet
   using the LDP label towards PE2.

   PE1 cannot push an ELI/EL for the binding SID without knowing that
   the tail-end of the LSP associated with the binding (PE2) is entropy
   label capable.
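
   The constraint above can be sketched as a guard at the ingress node.
   This is a minimal illustration and not a protocol encoding: the
   three-state model (True, False, or None for "unknown") of what the
   ingress has learned about the tail-end, and the function name, are
   assumptions made for this example.

```python
from typing import Optional


def may_push_eli_el(tail_end_elc: Optional[bool]) -> bool:
    """Decide whether an ingress may push an ELI/EL for a binding SID.

    tail_end_elc is True if the tail-end of the LSP bound to the SID is
    known to be entropy label capable, False if it is known not to be,
    and None if the ingress has no information (hypothetical model).
    """
    # Unknown capability is treated like "not capable": without
    # positive knowledge, the ingress refrains from pushing an ELI/EL.
    return tail_end_elc is True
```

   With this model, PE1 would call may_push_eli_el() with whatever it
   has learned about PE2 and only insert the ELI/EL when the result is
   True.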

   To accommodate the mix of signalling protocols involved during the
   stitching, the entropy label capability SHOULD be propagated between
   the signalling protocols. Each binding SID SHOULD have its own
   entropy label capability that MUST be inherited from the entropy
   label capability of the associated LSP. If the router advertising
   the binding SID does not know the ELC state of the target FEC, it
   MUST NOT set the ELC for the binding SID. An ingress node MUST NOT
   push an ELI/EL associated with a binding SID unless this binding SID
   has the entropy label capability. How the entropy label capability
   is advertised for a binding SID is outside the scope of this
   document.

   In our example, if PE2 is LDP entropy label capable, it will add the
   entropy label capability in its LDP advertisement. When P5 receives
   the FEC/label binding for PE2, it learns about the ELC and can set
   the ELC in the mapping server advertisement. Thus PE1 learns about
   the ELC of PE2 and may push an ELI/EL associated with the binding
   SID.

   The proposed solution only works if the SPRING router advertising the
   binding SID is also performing the dataplane LSP stitching. In our
   example, if the mapping server function is hosted on P8 instead of
   P5, P8 does not know about the ELC state of PE2's LDP FEC. As a
   consequence, it does not set the ELC for the associated binding SID.

7.  Insertion of entropy labels for SPRING path

7.1.  Overview

   The solution described in this section follows the dataplane
   processing defined in [RFC6790]. Within a SPRING path, a node may be
   ingress, egress, transit (regarding the entropy label processing
   described in [RFC6790]), or it can be any combination of those. For
   example:

   o  The ingress node of a SPRING domain may be an ingress node from an
      entropy label perspective.

   o  Any LSR terminating a segment of the SPRING path is an egress node
      (because it terminates the segment) but may also be a transit node
      if the SPRING path is not terminated because there is a subsequent
      SPRING MPLS label in the stack.

   o  Any LSR processing a binding SID may be a transit node and an
      ingress node (because it may push additional labels when
      processing the binding SID).

   As described earlier, an LSR may have a limitation, ERLD, on the
   depth of the label stack that it can read and process in order to do
   multipath load-balancing based on entropy labels.

   If an EL does not occur within the ERLD of an LSR in the label stack
   of an MPLS packet that it receives, then it would lead to poor load-
   balancing at that LSR. Hence an ELI/EL pair must be within the ERLD
   of the LSR in order for the LSR to use the EL during load-balancing.

   Adding a single ELI/EL pair for the entire SPRING path may also lead
   to poor load-balancing because the ELI/EL may not occur within the
   ERLD of some LSR on the path (if it is too deep) or may not be
   present in the stack when it reaches some LSRs (if it is too
   shallow).

   In order for the EL to occur within the ERLD of LSRs along the path
   corresponding to a SPRING label stack, multiple <ELI, EL> pairs MAY
   be inserted in this label stack.

   The insertion of the ELI/EL SHOULD occur only with a SPRING label
   advertised by an LSR that advertised an ERLD (the LSR is entropy
   label capable) or with a SPRING label associated with a binding SID
   that has the ELC set.

   The ELs among multiple <ELI, EL> pairs inserted in the stack MAY be
   the same or different. The LSR that inserts <ELI, EL> pairs MAY have
   limitations on the number of such pairs that it can insert and also
   the depth at which it can insert them. If, due to limitations, the
   inserted ELs are at positions such that an LSR along the path
   receives an MPLS packet without an EL in the label stack within that
   LSR's ERLD, then the load-balancing performed by that LSR would be
   poor. An implementation MAY consider multiple criteria when
   inserting <ELI, EL> pairs.

7.1.1.  Example 1 where the ingress node has a sufficient MSD

                    ECMP           LAG           LAG
   PE1 --- P1 --- P2 --- P3 --- P4 --- P5 --- P6 --- PE2

                                Figure 4

   In Figure 4, PE1 wants to forward some MPLS VPN traffic over an
   explicit path to PE2 resulting in the following label stack to be
   pushed onto the received IP header: . PE1 is limited
   to pushing a maximum of 11 labels (MSD=11). P2, P3 and P6 have an
   ERLD of 3 while others have an ERLD of 10.

   PE1 can only add two ELI/EL pairs in the label stack due to its MSD
   limitation. It should insert them strategically to benefit load-
   balancing along the longest part of the path.

   PE1 may take into account multiple parameters when inserting ELs, as
   examples:

   o  The ERLD value advertised by transit nodes.

   o  The requirement of load-balancing for a particular label value.

   o  Any service provider preference: favor beginning of the path or
      end of the path.

   In Figure 4, a good strategy may be to use the following stack
   . The original stack requests P2 to forward
   based on a L3 adjacency set that will require load-balancing.
   Therefore it is important to ensure that P2 can load-balance
   correctly. As P2 has a limited ERLD of 3, an ELI/EL must be inserted
   just next to the label that P2 will use to forward. On the path to
   PE2, P3 also has a limited ERLD, but P3 will forward based on a basic
   adjacency segment that may require no load-balancing. Therefore it
   does not seem important to ensure that P3 can do load-balancing
   despite its limited ERLD. The next nodes along the forwarding
   path have a high ERLD that does not cause any issue, except P6.
   Moreover, P6 uses LAGs to reach PE2 and so is expected to load-
   balance. It therefore becomes important to insert a new ELI/EL just
   next to P6's forwarding label.

   In the case above, the ingress node had enough label push capacity to
   ensure end-to-end load-balancing taking into account the path
   attributes. There might be some cases where the ingress node does
   not have the necessary label imposition capacity.

7.1.2.  Example 2 where the ingress node has not a sufficient MSD

                    ECMP           LAG          ECMP          ECMP
   PE1 --- P1 --- P2 --- P3 --- P4 --- P5 --- P6 --- P7 --- P8 --- PE2

                                Figure 5

   In Figure 5, PE1 wants to forward MPLS VPN traffic over an
   explicit path to PE2 resulting in the following label stack to be
   pushed onto the IP header: . PE1 is limited to pushing a maximum of
   11 labels. P2, P3 and P6 have an ERLD of 3 while others have an ERLD
   of 15.

   Using a similar strategy as the previous case may lead to a dilemma,
   as PE1 can only push a single ELI/EL while we may need a minimum of
   three to load-balance the end-to-end path. An optimized stack that
   would enable end-to-end load-balancing may be: .

   A decision needs to be taken to favor some part of the path for load-
   balancing considering that load-balancing may not work on the other
   part. A service provider may decide to place the ELI/EL after the P6
   forwarding label as it will allow P4 and P6 to load-balance. Placing
   the ELI/EL at the bottom of the stack is also a possibility, enabling
   load-balancing for P4 and P8.

7.2.  Considerations for the placement of entropy labels

   The sample cases described in the previous section showed that
   placing the ELI/EL when the maximum number of labels to be pushed is
   limited is not an easy decision and multiple criteria may be taken
   into account.
563 This section describes some considerations that could be taken into 564 account when placing ELI/ELs. This list of criteria is not 565 considered as exhaustive and an implementation MAY take into account 566 additional criteria or tie-breakers that are not documented here. 568 An implementation SHOULD try to maximize the load-balancing where 569 multiple ECMP paths are available and minimize the number of EL/ELIs 570 that need to be inserted. In case of a trade-off, an implementation 571 MAY provide flexibility to the operator to select the criteria to be 572 considered when placing EL/ELIs or the sub-objective for which to 573 optimize. 575 2 2 576 PE1 -- P1 -- P2 --P3 --- P4 --- P5 -- ... -- P8 -- P9 -- PE2 577 | | 578 P3'--- P4'--- P5' 580 Figure 6 582 The figure above will be used as reference in the following 583 subsections. All metrics are equal to 1, except P3-P4 and P4-P5 584 which have a metric 2. 586 7.2.1. ERLD value 588 As mentioned in Section 7.1, the ERLD value is an important parameter 589 to consider when inserting ELI/EL. If an ELI/EL does not fall within 590 the ERLD of a node on the path, the node will not be able to load- 591 balance the traffic efficiently. 593 The ERLD value can be advertised via protocols and those extensions 594 are described in separate documents [I-D.ietf-isis-mpls-elc] and 595 [I-D.ietf-ospf-mpls-elc]. 597 Let's consider a path from PE1 to PE2 using the following stack 598 pushed by PE1: . 600 Using the ERLD as an input parameter may help to minimize the number 601 of required ELI/EL pairs to be inserted. An ERLD value must be 602 retrieved for each SPRING label in the label stack. 604 For a label bound to an adjacency segment, the ERLD is the ERLD of 605 the node that advertised the adjacency segment. In the example 606 above, the ERLD associated with Adj_P1P2 would be the ERLD of router 607 P1 as P1 will perform the forwarding based on the Adj_P1P2 label. 
609 For a label bound to a node segment, multiple strategies MAY be 610 implemented. An implementation may try to evaluate the minimum ERLD 611 value along the node segment path. If an implementation cannot find 612 the minimum ERLD along the path of the segment, it can use the ERLD 613 of the starting node instead. In the example above, if the 614 implementation supports computation of minimum ERLD along the path, 615 the ERLD associated with label Node_P9 would be the minimum ERLD 616 between nodes {P2,P3,P4 ..., P8}. If an implementation does not 617 support the computation of minimum ERLD, it should consider the ERLD 618 of P2 (starting node that will forward based on the Node_P9 label). 620 For a label bound to a binding segment, if the binding segment 621 describes a path, an implementation may also try to evaluate the 622 minimum ERLD along this path. If the implementation cannot find the 623 minimum ERLD along the path of the segment, it can use the ERLD of 624 the starting node instead. 626 7.2.2. Segment type 628 Depending of the type of segment a particular label is bound to, an 629 implementation may deduce that this particular label will be subject 630 to load-balancing on the path. 632 7.2.2.1. Node-SID 634 An MPLS label bound to a Node-SID represents a path that may cross 635 multiple hops. Load-balancing may be needed on the node starting 636 this path but also on any node along the path. 638 In the figure 6, let's consider a path from PE1 to PE2 using the 639 following stack pushed by PE1: . 642 If, for example, PE1 is limited to push 6 labels, it can add a single 643 ELI/EL within the label stack. An operator may want to favor a 644 placement that would allow load-balancing along the Node-SID path. 645 In the figure above, P3 which is along the Node-SID path requires 646 load-balancing on two equal-cost paths. 648 An implementation may try to evaluate if load-balancing is really 649 required within a node segment path. 
This could be done by running 650 an additional SPT computation and analysis of the node segment path 651 to prevent a node segment that does not really require load-balancing 652 from being preferred when placing EL/ELIs. Such inspection may be 653 time consuming for implementations and without a 100% guarantee, as a 654 node segment path may use LAG that could be invisible from the IP 655 topology. A simpler approach would be to consider that a label bound 656 to a Node-SID will be subject to load-balancing and requires an EL/ 657 ELI. 659 7.2.2.2. Adjacency-set SID 661 An adjacency-set is an adjacency SID that refers to a set of 662 adjacencies. When an adjacency-set segment is used within a label 663 stack, an implementation can deduce that load-balancing is expected 664 at the node that advertised this adjacency segment. An 665 implementation could then favor this particular label value when 666 placing ELI/ELs. 668 7.2.2.3. Adjacency-SID representing a single IP link 670 When an adjacency segment representing a single IP link is used 671 within a label stack, an implementation can deduce that load- 672 balancing may not be expected at the node that advertised this 673 adjacency segment. 675 The implementation could then decide to place ELI/ELs to favor other 676 LSRs than the one advertising this adjacency segment. 678 Readers should note that an adjacency segment representing a single 679 IP link may require load-balancing. This is the case when a LAG (L2 680 bundle) is implemented between two IP nodes and the L2 bundle SR 681 extensions [I-D.ietf-isis-l2bundles] are not implemented. In such a 682 case, it may be useful to insert an EL/ELI in a readable position for 683 the LSR advertising the label associated with the adjacency segment. 685 7.2.2.4. Adjacency-SID representing a single link within a L2 bundle 687 When L2 bundle SR extensions [I-D.ietf-isis-l2bundles] are used, 688 adjacency segments may be advertised for each member of the bundle. 
689 In this case, an implementation can deduce that load-balancing is not 690 expected on the LSR advertising this segment and could then decide to 691 place ELI/ELs to favor LSRs other than the one advertising this 692 adjacency segment. 694 7.2.2.5. Adjacency-SID representing an L2 bundle 696 When L2 bundle SR extensions [I-D.ietf-isis-l2bundles] are used, an 697 adjacency segment may be advertised to represent the bundle. In this 698 case, an implementation can deduce that load-balancing is expected on 699 the LSR advertising this segment and could then decide to place ELI/ 700 ELs to favor this LSR. 702 7.2.3. Maximizing number of LSRs that will load-balance 704 When placing ELI/ELs, an implementation may try to maximize the 705 number of LSRs that both need to load-balance (i.e., have ECMP paths) 706 and that will be able to perform load-balancing (i.e., the EL 707 is within their ERLD). 709 Consider a path from PE1 to PE2 using the following stack 710 pushed by PE1: . All 711 routers have an ERLD of 10, except P1 and P2, which have an ERLD of 4. 712 PE1 is able to push 6 labels, so only a single ELI/EL can be added. 714 In the example above, adding ELI/EL next to Adj_P1P2 will only allow 715 load-balancing at P1, while inserting it next to Adj_PE2P9 will allow 716 load-balancing at P2, P3, ..., P9, thereby maximizing the number of LSRs that 717 could perform load-balancing. 719 7.2.4. Preference for a part of the path 721 An implementation may propose to favor a part of the end-to-end path 722 when the number of EL/ELIs that can be pushed is not enough to cover 723 the entire path. As an example, a service provider may want to favor 724 load-balancing at the beginning of the path or at the end of the path, so 725 the implementation should prefer putting the ELI/ELs near the top or 726 near the bottom of the stack. 728 7.2.5. Combining criteria 730 An implementation can combine multiple criteria to determine the best 731 EL/ELI placement.
However, combining too many criteria may lead to 732 implementation complexity and high resource consumption. Each time 733 the network topology changes, a new evaluation of the EL/ELI 734 placement will be necessary for each impacted LSP. 736 8. A simple example algorithm 738 A simple implementation might take ERLD into account when placing 739 ELI/ELs while trying to minimize the number of EL/ELIs inserted and 740 to maximize the number of LSRs that can load-balance. 742 The example algorithm is based on the following considerations: 744 o An LSR that is limited in the number of <ELI, EL> pairs that it 745 can insert SHOULD insert such pairs deeper in the stack. 747 o An LSR should try to insert <ELI, EL> pairs at positions so that 748 for the maximum number of transit LSRs, the EL occurs within the 749 ERLD of those LSRs. 751 o An LSR should try to insert the minimum number of such pairs while 752 trying to satisfy the above criteria. 754 The pseudocode of the example algorithm is shown below.
756 Initialize the current EL insertion point to the
757   bottommost label in the stack that is EL-capable
758 while (local-node can push more <ELI, EL> pairs OR
759        insertion point is not above label stack) {
760     insert an <ELI, EL> pair below current insertion point
761     move new insertion point up from current insertion point until
762         ((last inserted EL is below the ERLD) AND (ERLD > 2)
763          AND
764          (new insertion point is EL-capable))
765     set current insertion point to new insertion point
766 }
768 Figure 7: Example algorithm to insert <ELI, EL> pairs in a label 769 stack 771 When this algorithm is applied to the example described in Section 3, 772 it will result in ELs being inserted in two positions, one below the 773 label L_N-D and another below L_N-P3. Thus the resulting label stack 774 would be 776 9.
Deployment Considerations 778 As long as LSR node dataplane capabilities are limited (number of 779 labels that can be pushed, or number of labels that can be 780 inspected), hop-by-hop load-balancing of SPRING encapsulated flows 781 will require trade-offs. 783 The entropy label is still a good and usable solution as it allows load- 784 balancing without having to perform deep packet inspection on each 785 LSR: it does not seem reasonable to have an LSR inspecting UDP ports 786 within a GRE tunnel carried over a 15-label SPRING tunnel. 788 Due to the limited capacity of reading a deep stack of MPLS labels, 789 multiple EL/ELIs may be required within the stack, which directly 790 impacts the capacity of the head-end to push a deep stack: each EL/ 791 ELI inserted requires two additional labels to be pushed. 793 Placement strategies of EL/ELIs are required to find the best trade- 794 off. Multiple criteria may be taken into account and some level of 795 customization (by the user) may be required to accommodate the 796 different deployments. Analyzing the path of each destination to 797 determine the best EL/ELI placement may be time consuming for the 798 control plane; we encourage implementations to find the best trade- 799 off between simplicity, resource consumption, and load-balancing 800 efficiency. 802 In the future, hardware and software capacity may increase dataplane 803 capabilities and may remove some of these limitations, increasing 804 load-balancing capability using entropy labels. 806 10. Options considered 808 Different options that were considered to arrive at the recommended 809 solution are documented in this section. 811 These options are detailed here only for historical purposes. 813 10.1. Single EL at the bottom of the stack 815 In this option, a single EL is used for the entire label stack. The 816 source LSR S encodes the entropy label at the bottom of the label 817 stack.
In the example described in Section 3, this results in the following 818 label stack at LSR S: 819 . Note that the notation in [RFC6790] is 820 used to describe the label stack. An issue with this approach is 821 that as the label stack grows due to an increase in the number of SIDs, 822 the EL goes correspondingly deeper in the label stack. Hence, 823 transit LSRs have to access a larger number of bytes in the packet 824 header when making forwarding decisions. In the example described in 825 Section 3, if we consider that the LSR P1 has an ERLD of 3, P1 would 826 load-balance traffic poorly on the parallel links L3 and L4 since the 827 EL is below the ERLD of P1. A load-balanced network design using 828 this approach must ensure that all intermediate LSRs have the 829 capability to read the maximum label stack depth as required for the 830 application that uses source routed stacking. 832 This option was rejected since there exist a number of hardware 833 implementations which have a low maximum readable label depth. 834 Choosing this option can lead to a loss of load-balancing using EL in 835 a significant part of the network when that is a critical requirement 836 in a service-provider network. 838 10.2. An EL per segment in the stack 840 In this option, each segment/label in the stack can be given its own 841 EL. When load-balancing is required to direct traffic on a segment, 842 the source LSR pushes an <ELI, EL> before pushing the label 843 associated with this segment. In the example described in Section 3, 844 the source LSR S encoded label stack would be where all the ELs can be the same. Accessing the 846 EL at an intermediate LSR is independent of the depth of the label 847 stack and hence independent of the specific application that uses 848 source routed tunnels with label stacking. A drawback is that the 849 depth of the label stack grows significantly, to almost 3 times the 850 number of labels originally in the label stack.
The network design should 851 ensure that source LSRs have the capability to push such a deep label 852 stack. Also, the bandwidth overhead and potential MTU issues of deep 853 label stacks should be considered in the network design. 855 This option was rejected due to the existence of hardware 856 implementations that can push only a limited number of labels on the label 857 stack. Choosing this option would result in a hardware requirement 858 to push two additional labels per tunnel label. It would therefore 859 restrict the number of tunnels that can be stacked in an LSP and hence 860 constrain the types of LSPs that can be created. This was considered 861 unacceptable. 863 10.3. A re-usable EL for a stack of tunnels 865 In this option, an LSR that terminates a tunnel re-uses the EL of the 866 terminated tunnel for the next inner tunnel. It does this by storing 867 the EL from the outer tunnel when that tunnel is terminated and re- 868 inserting it below the next inner tunnel label during the label swap 869 operation. The LSR that stacks tunnels should insert an EL below the 870 outermost tunnel. It should not insert ELs for any inner tunnels. 871 Also, the penultimate hop LSR of a segment must not pop the ELI and 872 EL even though they are exposed as the top labels since the 873 terminating LSR of that segment would re-use the EL for the next 874 segment. 876 In Section 3 above, the source LSR S encoded label stack would be 877 . At P1, the outgoing label stack 878 would be after it has load-balanced 879 to one of the links L3 or L4. At P3 the outgoing label stack would 880 be . At P2, the outgoing label stack would be and it would load-balance to one of the next-hop LSRs P4 882 or P5. Accessing the EL at an intermediate LSR (e.g., P1) is 883 independent of the depth of the label stack and hence independent of 884 the specific use-case to which the label stack is applied.
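The store-and-re-insert step of the re-usable EL option can be sketched as follows. This is only an illustration under stated assumptions: the function name reuse_el, the list-of-strings stack model (top label first), and the choice to leave the inner label unswapped are all hypothetical and are not taken from this document.

```python
from typing import List

ELI = "ELI"  # placeholder for the Entropy Label Indicator


def reuse_el(stack: List[str]) -> List[str]:
    """Sketch of what an LSR terminating a segment would do (hypothetical).

    `stack` is top-first. It is either [segment, ELI, el, inner...] or,
    if the penultimate hop already popped the segment label while leaving
    the ELI and EL in place, [ELI, el, inner...].
    """
    s = list(stack)
    if s and s[0] != ELI:
        s = s[1:]                      # drop the terminated segment label
    if len(s) < 2 or s[0] != ELI:
        raise ValueError("expected ELI and EL under the terminated label")
    el = s[1]                          # store the EL of the outer tunnel
    inner = s[2:]
    if not inner:
        return []                      # no inner tunnel left to re-use it
    # Re-insert the stored EL below the next inner tunnel label.
    return [inner[0], ELI, el] + inner[1:]
```

For instance, a stack of ["T1", "ELI", "e", "T2", "T3"] becomes ["T2", "ELI", "e", "T3"], so the EL stays directly below the top tunnel label regardless of how many tunnels are stacked. A real LSR would also swap the inner label, which the sketch omits.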
886 This option was rejected due to the significant change in label swap 887 operations that would be required for existing hardware. 889 10.4. EL at top of stack 891 A slight variant of the re-usable EL option is to keep the EL at the 892 top of the stack rather than below the tunnel label. In this case, 893 each LSR that is not terminating a segment should continue to keep 894 the received EL at the top of the stack when forwarding the packet 895 along the segment. An LSR that terminates a segment should use the 896 EL from the terminated segment at the top of the stack when 897 forwarding onto the next segment. 899 This option was rejected due to the significant change in label swap 900 operations that would be required for existing hardware. 902 10.5. ELs at readable label stack depths 904 In this option, the source LSR inserts ELs for tunnels in the label 905 stack at depths such that each LSR along the path that must load- 906 balance is able to access at least one EL. Note that the source LSR 907 may have to insert multiple ELs in the label stack at different 908 depths for this to work since intermediate LSRs may differ in how 909 deep into a label stack they can read. The label 910 stack depth access value of intermediate LSRs must be known to create 911 such a label stack. How this value is determined is outside the 912 scope of this document. This value can be advertised using a 913 protocol such as an IGP. 915 Applying this method to the example in Section 3 above, if LSR P1 916 needs to have the EL within a depth of 4, then the source LSR S 917 encoded label stack would be where all the ELs would typically have the same value. 920 In the case where the ERLD has different values along the path and 921 the LSR that is inserting <ELI, EL> pairs has no limit on how many 922 pairs it can insert, and it knows the appropriate positions in the 923 stack where they should be inserted, this option is the same as the 924 recommended solution in Section 7.
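As a rough illustration of inserting ELs at readable depths, the sketch below places each <ELI, EL> pair as late as possible so that every label that needs load-balancing still has an EL within its ERLD. It is not the algorithm of this document. The assumptions are: the name place_els and the (label, erld) input model are hypothetical, every position is treated as EL-capable, the source LSR has no push limit, and ERLD is counted from the label on top of the stack down to and including the EL.

```python
from typing import List, Optional, Tuple

ELI, EL = "ELI", "EL"


def place_els(stack: List[Tuple[str, Optional[int]]]) -> List[str]:
    """Return the label stack (top first) with <ELI, EL> pairs added.

    Each input entry is (label, erld): erld is the smallest ERLD among the
    LSRs that must load-balance while `label` is on top, or None when no
    load-balancing is needed for that label (hypothetical input model).
    """
    out: List[str] = []
    # Largest index in `out` at which an ELI can still be placed so that
    # every label waiting for an EL can read it; None if nothing is waiting.
    deadline: Optional[int] = None
    for label, erld in stack:
        pos = len(out)
        out.append(label)
        # An EL sits at least 3 deep (label, ELI, EL); ERLD < 3 cannot be served.
        if erld is not None and erld >= 3:
            # The EL index (ELI index + 1) must be <= pos + erld - 1.
            latest = pos + erld - 2
            deadline = latest if deadline is None else min(deadline, latest)
        if deadline is not None and deadline == len(out):
            out += [ELI, EL]   # cannot wait for another label
            deadline = None
    if deadline is not None:
        out += [ELI, EL]       # serve the remaining labels at the bottom
    return out
```

With five labels L1..L5 that all have an ERLD of 4, the sketch yields L1, L2, ELI, EL, L3, L4, ELI, EL, L5, ELI, EL. With an ERLD of 10 everywhere, a single pair at the bottom of the stack is enough. The cost is visible in the first case: three pairs add six labels to a five-label stack.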
926 Note that the solution described in Section 7 is a refinement of this 927 option that balances the number of pushed labels against the desired 928 entropy. 930 11. Acknowledgements 932 The authors would like to thank John Drake, Loa Andersson, Curtis 933 Villamizar, Greg Mirsky, Markus Jork, Kamran Raza, Carlos Pignataro, 934 Bruno Decraene, Chris Bowers and Nobo Akiya for their review comments 935 and suggestions. 937 12. Contributors 938 Xiaohu Xu 939 Huawei 941 Email: xuxiaohu@huawei.com 943 Wim Henderickx 944 Nokia 946 Email: wim.henderickx@nokia.com 948 Gunter Van De Velde 949 Nokia 951 Email: gunter.van_de_velde@nokia.com 953 Acee Lindem 954 Cisco 956 Email: acee@cisco.com 958 13. IANA Considerations 960 This memo includes no request to IANA. Note to RFC Editor: Remove 961 this section before publication. 963 14. Security Considerations 965 This document does not introduce any new security considerations 966 beyond those already listed in [RFC6790]. 968 15. References 970 15.1. Normative References 972 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 973 Requirement Levels", BCP 14, RFC 2119, 974 DOI 10.17487/RFC2119, March 1997, 975 . 977 [RFC6790] Kompella, K., Drake, J., Amante, S., Henderickx, W., and 978 L. Yong, "The Use of Entropy Labels in MPLS Forwarding", 979 RFC 6790, DOI 10.17487/RFC6790, November 2012, 980 . 982 [RFC7855] Previdi, S., Ed., Filsfils, C., Ed., Decraene, B., 983 Litkowski, S., Horneffer, M., and R. Shakir, "Source 984 Packet Routing in Networking (SPRING) Problem Statement 985 and Requirements", RFC 7855, DOI 10.17487/RFC7855, May 986 2016, . 988 [I-D.ietf-spring-segment-routing] 989 Filsfils, C., Previdi, S., Decraene, B., Litkowski, S., 990 and R. Shakir, "Segment Routing Architecture", draft-ietf- 991 spring-segment-routing-12 (work in progress), June 2017. 993 [I-D.ietf-spring-segment-routing-mpls] 994 Filsfils, C., Previdi, S., Bashandy, A., Decraene, B., 995 Litkowski, S., and R.
Shakir, "Segment Routing with MPLS 996 data plane", draft-ietf-spring-segment-routing-mpls-10 997 (work in progress), June 2017. 999 15.2. Informative References 1001 [RFC4206] Kompella, K. and Y. Rekhter, "Label Switched Paths (LSP) 1002 Hierarchy with Generalized Multi-Protocol Label Switching 1003 (GMPLS) Traffic Engineering (TE)", RFC 4206, 1004 DOI 10.17487/RFC4206, October 2005, 1005 . 1007 [RFC7325] Villamizar, C., Ed., Kompella, K., Amante, S., Malis, A., 1008 and C. Pignataro, "MPLS Forwarding Compliance and 1009 Performance Requirements", RFC 7325, DOI 10.17487/RFC7325, 1010 August 2014, . 1012 [I-D.ietf-isis-mpls-elc] 1013 Xu, X., Kini, S., Sivabalan, S., Filsfils, C., and S. 1014 Litkowski, "Signaling Entropy Label Capability Using IS- 1015 IS", draft-ietf-isis-mpls-elc-02 (work in progress), 1016 October 2016. 1018 [I-D.ietf-ospf-mpls-elc] 1019 Xu, X., Kini, S., Sivabalan, S., Filsfils, C., and S. 1020 Litkowski, "Signaling Entropy Label Capability Using 1021 OSPF", draft-ietf-ospf-mpls-elc-04 (work in progress), 1022 November 2016. 1024 [I-D.ietf-isis-l2bundles] 1025 Ginsberg, L., Bashandy, A., Filsfils, C., Nanduri, M., and 1026 E. Aries, "Advertising L2 Bundle Member Link Attributes in 1027 IS-IS", draft-ietf-isis-l2bundles-07 (work in progress), 1028 May 2017. 1030 Authors' Addresses 1032 Sriganesh Kini 1034 EMail: sriganeshkini@gmail.com 1036 Kireeti Kompella 1037 Juniper 1039 EMail: kireeti@juniper.net 1041 Siva Sivabalan 1042 Cisco 1044 EMail: msiva@cisco.com 1046 Stephane Litkowski 1047 Orange 1049 EMail: stephane.litkowski@orange.com 1051 Rob Shakir 1052 Google 1054 EMail: rjs@rob.sh 1056 Jeff Tantsura 1058 EMail: jefftant@gmail.com