Network Working Group                                        K. Kompella
Internet-Draft                                                  J. Drake
Updates: 3031 (if approved)                             Juniper Networks
Intended status: Standards Track                               S. Amante
Expires: November 8, 2012                    Level 3 Communications, LLC
                                                           W. Henderickx
                                                          Alcatel-Lucent
                                                                 L. Yong
                                                              Huawei USA
                                                             May 7, 2012

The Use of Entropy Labels in MPLS Forwarding
draft-ietf-mpls-entropy-label-02

Abstract

Load balancing is a powerful tool for engineering traffic across a network. This memo suggests ways of improving load balancing across MPLS networks using the concept of "entropy labels". It defines the concept, describes why entropy labels are useful, enumerates properties of entropy labels that allow maximal benefit, and shows how they can be signaled and used for various applications.

Status of this Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on November 8, 2012.

Copyright Notice

Copyright (c) 2012 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document.
Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.

Table of Contents

1.  Introduction
  1.1.  Conventions used
  1.2.  Motivation
2.  Approaches
3.  Entropy Labels and Their Structure
4.  Data Plane Processing of Entropy Labels
  4.1.  Egress LSR
  4.2.  Ingress LSR
  4.3.  Transit LSR
  4.4.  Penultimate Hop LSR
5.  Signaling for Entropy Labels
  5.1.  LDP Signaling
  5.2.  BGP Signaling
  5.3.  RSVP-TE Signaling
6.  Operations, Administration, and Maintenance (OAM) and Entropy Labels
7.  MPLS-TP and Entropy Labels
8.  Point-to-Multipoint LSPs and Entropy Labels
9.  Entropy Labels in Various Scenarios
  9.1.  LDP Tunnel
  9.2.  LDP Over RSVP-TE
  9.3.  MPLS Applications
10. Security Considerations
11. IANA Considerations
  11.1. Reserved Label for ELI
  11.2. LDP Entropy Label Capability TLV
  11.3. BGP Entropy Label Capability Attribute
  11.4. RSVP-TE Entropy Label Capability flag
12. Acknowledgments
13. References
  13.1. Normative References
  13.2. Informative References
Appendix A.  Applicability of LDP Entropy Label Capability TLV
Authors' Addresses

1. Introduction

Load balancing, or multi-pathing, is an attempt to balance traffic across a network by allowing the traffic to use multiple paths. Load balancing has several benefits: it eases capacity planning; it can help absorb traffic surges by spreading them across multiple paths; it allows better resilience by offering alternate paths in the event of a link or node failure.

As providers scale their networks, they use several techniques to achieve greater bandwidth between nodes. Two widely used techniques are: Link Aggregation Group (LAG) and Equal-Cost Multi-Path (ECMP). LAG is used to bond together several physical circuits between two adjacent nodes so they appear to higher-layer protocols as a single, higher bandwidth 'virtual' pipe. ECMP is used between two nodes separated by one or more hops, to allow load balancing over several shortest paths in the network. This is typically obtained by arranging IGP metrics such that there are several equal cost paths between source-destination pairs.
Both of these techniques may, and often do, co-exist in various parts of a given provider's network, depending on various choices made by the provider. A very important requirement when load balancing is that packets belonging to a given 'flow' must be mapped to the same path, i.e., the same exact sequence of links across the network. This is to avoid jitter, latency and re-ordering issues for the flow. What constitutes a flow varies considerably. A common example of a flow is a TCP session. Other examples are an L2TP session corresponding to a given broadband user, or traffic within an ATM virtual circuit. To meet this requirement, a node uses certain fields, termed 'keys', within a packet's header as input to a load balancing function (typically a hash function) that selects the path for all packets in a given flow. The keys chosen for the load balancing function depend on the packet type; a typical set (for IP packets) is the IP source and destination addresses, the protocol type, and (for TCP and UDP traffic) the source and destination port numbers. An overly conservative choice of fields may lead to many flows mapping to the same hash value (and consequently poorer load balancing); an overly aggressive choice may map a flow to multiple values, potentially violating the above requirement. For MPLS networks, most of the same principles (and benefits) apply. However, finding useful keys in a packet for the purpose of load balancing can be more of a challenge. In many cases, MPLS encapsulation may require fairly deep inspection of packets to find these keys at transit LSRs. One way to eliminate the need for this deep inspection is to have the ingress LSR of an MPLS Label Switched Path extract the appropriate keys from a given packet, input them to its load balancing function, and place the result in an additional label, termed the 'entropy label', as part of the MPLS label stack it pushes onto that packet. 
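The ingress-side idea just described can be sketched in a few lines. This is only an illustration under stated assumptions, not the method this memo mandates: the function names `flow_hash`, `pick_path` and `entropy_label` are hypothetical, CRC-32 stands in for whatever load balancing function an implementation actually uses, and the keys are the typical IP set listed above. The sketch also steers clear of label values 0-15, which Section 3 excludes from use as entropy labels.

```python
import zlib

LABEL_SPACE = 1 << 20      # the label field is 20 bits wide
FIRST_USABLE_LABEL = 16    # values 0-15 are reserved labels


def flow_hash(src_ip, dst_ip, proto, src_port, dst_port):
    """Hash the typical IP load balancing keys (CRC-32 is an assumption)."""
    key = f"{src_ip}|{dst_ip}|{proto}|{src_port}|{dst_port}".encode()
    return zlib.crc32(key)


def pick_path(flow_keys, num_paths):
    """Select one of num_paths; every packet of a flow maps to the same path."""
    return flow_hash(*flow_keys) % num_paths


def entropy_label(flow_keys):
    """Fold the hash into a 20-bit label value outside the reserved range."""
    usable = LABEL_SPACE - FIRST_USABLE_LABEL
    return FIRST_USABLE_LABEL + flow_hash(*flow_keys) % usable


flow = ("192.0.2.1", "198.51.100.7", 6, 49152, 443)
print(pick_path(flow, 4), entropy_label(flow))
```

Because the label is a pure function of the flow keys, all packets of one flow carry the same value, which is what keeps a flow on a single path.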
The packet's entire MPLS label stack can then be used by transit LSRs to perform load balancing, as the entropy label introduces the right level of "entropy" into the label stack.

There are five key reasons why this is beneficial:

1. at the ingress LSR, MPLS encapsulation hasn't yet occurred, so deep inspection is not necessary;
2. the ingress LSR has more context and information about incoming packets than transit LSRs;
3. ingress LSRs usually operate at lower bandwidths than transit LSRs, allowing them to do more work per packet;
4. transit LSRs do not need to perform deep packet inspection and can load balance effectively using only a packet's MPLS label stack; and
5. transit LSRs, not having the full context that an ingress LSR does, have the hard choice between potentially misinterpreting fields in a packet as valid keys for load balancing (causing packet ordering problems) or adopting a conservative approach (giving rise to sub-optimal load balancing). Entropy labels relieve them of making this choice.

This memo describes why entropy labels are needed and defines the properties of entropy labels; in particular how they are generated and received, and the expected behavior of transit LSRs. Finally, it describes in general how signaling works and what needs to be signaled, as well as specifics for the signaling of entropy labels for LDP ([RFC5036]), BGP ([RFC3107]), and RSVP-TE ([RFC3209]).

1.1. Conventions used

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC2119].
The following acronyms are used:

  BoS: Bottom of Stack
  CE: Customer Edge device
  ECMP: Equal Cost Multi-Path
  EL: Entropy Label
  ELC: Entropy Label Capability
  ELI: Entropy Label Indicator
  FEC: Forwarding Equivalence Class
  LAG: Link Aggregation Group
  LER: Label Edge Router
  LSR: Label Switching Router
  PE: Provider Edge Router
  PHP: Penultimate Hop Popping
  TC: Traffic Class
  TTL: Time-to-Live
  UHP: Ultimate Hop Popping
  VPLS: Virtual Private LAN (Local Area Network) Service
  VPN: Virtual Private Network

The term ingress (or egress) LSR is used interchangeably with ingress (or egress) LER. The term application throughout the text refers to an MPLS application (such as a VPN or VPLS).

A label stack (say of three labels) is denoted by <L1, L2, L3>, where L1 is the "outermost" label and L3 the innermost (closest to the payload). Packet flows are depicted left to right, and signaling is shown right to left (unless otherwise indicated).

The term 'label' is used both for the entire 32-bit label and the 20-bit label field within a label. It should be clear from the context which is meant.

1.2. Motivation

MPLS is a very successful generic forwarding substrate that transports several dozen types of protocols, most notably: IP, PWE3, VPLS and IP VPNs. Within each type of protocol, there typically exist several variants, each with a different set of load balancing keys, e.g., for IP: IPv4, IPv6, IPv6 in IPv4, etc.; for PWE3: Ethernet, ATM, Frame-Relay, etc. There are also several different types of Ethernet over PW encapsulation, ATM over PW encapsulation, etc. Finally, given the popularity of MPLS, it is likely that it will continue to be extended to transport new protocols.

Currently, each transit LSR along the path of a given LSP has to try to infer the underlying protocol within an MPLS packet in order to extract appropriate keys for load balancing.
Unfortunately, if the transit LSR is unable to infer the MPLS packet's protocol (as is often the case), it will typically use the topmost (or all) MPLS labels in the label stack as keys for the load balancing function. The result may be an extremely inequitable distribution of traffic across equal-cost paths exiting that LSR. This is because MPLS labels are generally fairly coarse-grained forwarding labels that typically describe a next-hop, or provide some form of demultiplexing and/or forwarding function, and do not describe the packet's underlying protocol.

On the other hand, an ingress LSR (e.g., a PE router) has detailed knowledge of a packet's contents, typically through a priori configuration of the encapsulation(s) that are expected at a given PE-CE interface (e.g., IPv4, IPv6, VPLS, etc.). Ingress LSRs also have more flexible forwarding hardware. PE routers need this information and these capabilities to:

a) apply the required services for the CE;
b) discern the packet's CoS forwarding treatment;
c) apply filters to forward or block traffic to/from the CE;
d) forward routing/control traffic to an onboard management processor; and
e) load-balance the traffic on their uplinks to transit LSRs (e.g., P routers).

By knowing the expected encapsulation types, an ingress LSR can apply a more specific set of payload parsing routines to extract the keys appropriate for a given protocol. This allows for significantly improved accuracy in determining the appropriate load balancing behavior for each protocol.

If the ingress LSR were to capture the flow information so gathered in a convenient form for downstream transit LSRs, transit LSRs could remain completely oblivious to the contents of each MPLS packet, and use only the captured flow information to perform load balancing. In particular, there will be no reason to duplicate an ingress LSR's complex packet/payload parsing functionality in a transit LSR.
This will result in less complex transit LSRs, enabling them to more easily scale to higher forwarding rates, larger port density, lower power consumption, etc. The idea in this memo is to capture this flow information as a label, the so-called entropy label.

Ingress LSRs can also adapt more readily to new protocols and extract the appropriate keys to use for load balancing packets of those protocols. This means that deploying new protocols or services in edge devices requires fewer concomitant changes in the core, resulting in higher edge service velocity and at the same time more stable core networks.

2. Approaches

There are two main approaches to encoding load balancing information in the label stack. The first allocates multiple labels for a particular Forwarding Equivalence Class (FEC). These labels are equivalent in terms of forwarding semantics, but having multiple labels allows flexibility in assigning labels to flows belonging to the same FEC. This approach has the advantage that the label stack has the same depth whether or not one uses label-based load balancing; consequently, there is no change to forwarding operations on transit and egress LSRs. However, it has a major drawback in that there is a significant increase in both signaling and forwarding state.

The other approach encodes the load balancing information as an additional label in the label stack, thus increasing the depth of the label stack by one. With this approach, there is minimal change to signaling state for a FEC; also, there is no change in forwarding operations in transit LSRs, and no increase of forwarding state in any LSR. The only purpose of the additional label is to increase the entropy in the label stack, so this is called an "entropy label". This memo focuses solely on this approach.

This latter approach uses upstream generated entropy labels, which may conflict with downstream allocated application labels.
There are a few approaches to deal with this: 1) allocate a pair of labels for each FEC, one that must have an entropy label below it, and one that must not; 2) use a label (the "Entropy Label Indicator") to indicate that the next label is an entropy label; and 3) allow entropy labels only where there is no possible confusion. The first doubles control and data plane state in the network; the last is too restrictive. The approach taken here is the second. In making both the above choices, the trade-off is to increase label stack depth rather than control and data plane state in the network.

Finally, one may choose to associate ELs with MPLS tunnels (LSPs), or with MPLS applications (e.g., VPNs). (What this entails is described in later sections.) We take the former approach, for the following reasons:

1. There are a small number of tunneling protocols for MPLS, but a large and growing number of applications. Defining ELs on a tunnel basis means simpler standards and lower development, interoperability and testing efforts.
2. As a consequence, there will be much less churn in the network as new applications (services) are defined and deployed.
3. Processing application labels in the data plane is more complex than processing tunnel labels. Thus, it is preferable to burden the latter rather than the former with EL processing.
4. Associating ELs with tunnels makes it simpler to deal with hierarchy, be it LDP-over-RSVP-TE or Carrier's Carrier VPNs. Each layer in the hierarchy can choose independently whether or not it wants ELs.

The cost of this approach is that ELIs will be mandatory; again, the trade-off is the size of the label stack. To summarize, the net increase in the label stack to use entropy labels is two: one reserved label for the ELI, and the entropy label itself.

3. Entropy Labels and Their Structure

An entropy label (as used here) is a label:

1. that is not used for forwarding;
2. that is not signaled; and
3.
whose only purpose in the label stack is to provide 'entropy' to improve load balancing.

Entropy labels are generated by an ingress LSR, based entirely on load balancing information. However, they MUST NOT have values in the reserved label space (0-15) [IANA MPLS Label Values]. To ensure that they are not used inadvertently for forwarding, entropy labels SHOULD have a TTL of 0. The CoS field of an entropy label can be set to any value deemed appropriate.

Since entropy labels are generated by an ingress LSR, an egress LSR MUST be able to distinguish unambiguously between entropy labels and application labels. This is accomplished by REQUIRING that the label immediately preceding an entropy label (EL) in the MPLS label stack be an 'entropy label indicator' (ELI). The ELI is a reserved label with value (TBD by IANA). An ELI MUST have 'Bottom of Stack' (BoS) bit = 0 ([RFC3032]). The TTL SHOULD be set to whatever value the label above it in the stack has. The CoS field can be set to any value deemed appropriate; typically, this will be the value in the label above the ELI in the label stack.

Entropy labels are useful for pseudowires ([RFC4447]). [I-D.ietf-pwe3-fat-pw] explains how entropy labels can be used for RFC 4447-style pseudowires, and thus is complementary to this memo, which focuses on how entropy labels can be used for tunnels, and thus for all other MPLS applications.

4. Data Plane Processing of Entropy Labels

4.1. Egress LSR

Suppose egress LSR Y is capable of processing entropy labels for a tunnel. Y indicates this to all ingresses via signaling (see Section 5). Y MUST be prepared to deal both with packets with an imposed EL and those without; the ELI will distinguish these cases. If a particular ingress chooses not to impose an EL, Y's processing of the received label stack (which might be empty) is as if Y chose not to accept ELs.

If an ingress LSR X chooses to impose an EL, then Y will receive a tunnel termination packet with label stack <TL, ELI, EL> <remaining packet header>. Y recognizes TL as the label it distributed to its upstreams for the tunnel, and pops it. (Note that TL may be the implicit null label, in which case it doesn't appear in the label stack.) Y then recognizes the ELI and pops two labels: the ELI and the EL. Y then processes the remaining packet header as normal; this may require further processing of tunnel termination, perhaps with further ELI+EL pairs. When processing the final tunnel termination, Y MAY enqueue the packet based on that tunnel TL's or ELI's TC value, and MAY use the tunnel TL's or ELI's TTL to compute the TTL of the remaining packet header. The EL's TTL MUST be ignored.

If any ELI processed by Y has the BoS bit set, Y MUST discard the packet, and MAY log an error. The EL's BoS bit will indicate whether or not there are more labels in the stack.

4.2. Ingress LSR

If an egress LSR Y indicates via signaling that it can process ELs on a particular tunnel, an ingress LSR X can choose whether or not to insert ELs for packets going into that tunnel. Y MUST handle both cases.

The steps that X performs to insert ELs are as follows:

1. On an incoming packet, identify the application to which the packet belongs, and thereby pick the fields to input to the load balancing function; call the output LB.
2. Determine the application label AL (if any). Push <AL> onto the packet.
3. Based on the application, the load balancing output LB and other factors, determine the egress LSR Y, the tunnel to Y, the specific interface to the next hop, and thus the tunnel label TL. Use LB to generate the entropy label EL.
4. If, for the chosen tunnel, Y has not indicated that it can process ELs, push <TL> onto the packet. If Y has indicated that it can process ELs for the tunnel, push <TL, ELI, EL> onto the packet. X SHOULD put the same TTL and TC fields for the ELI as it does for TL. The TTL for the EL MUST be zero. The TC for the EL may be any value.
5. X then determines whether further tunnel hierarchy is needed; if so, X goes back to step 3, possibly with a new egress Y for the new tunnel. Otherwise, X is done, and sends out the packet.

Notes:

a. X computes load balancing information and generates the EL based on the incoming application packet, even though the signaling of EL capability is associated with tunnels.
b. X MAY insert several entropy labels in the stack (each, of course, preceded by an ELI), potentially one for each hierarchical tunnel, provided that the egress for that tunnel has indicated that it can process ELs for that tunnel.
c. X MUST NOT include an entropy label for a given tunnel unless the egress LSR Y has indicated that it can process entropy labels for that tunnel.
d. The signaling and use of entropy labels in one direction (signaling from Y to X, and data path from X to Y) is completely independent of the signaling and use of entropy labels in the reverse direction (signaling from X to Y, and data path from Y to X).

4.3. Transit LSR

Transit LSRs MAY operate with no change in forwarding behavior. The following are suggestions for optimizations that improve load balancing, reduce the amount of packet data processed, and/or enhance backward compatibility.

If a transit LSR recognizes the ELI, it MAY choose to load balance solely on the following label (the EL); otherwise, it SHOULD use as much of the whole label stack as feasible as keys for the load balancing function, with the exception that reserved labels MUST NOT be used.

Some transit LSRs look beyond the label stack for better load balancing information. This is a simple, backward compatible approach in networks where some ingress LSRs impose ELs and others don't. However, this is of limited incremental value if an EL is indeed present, and requires more packet processing from the LSR. A transit LSR MAY choose to parse the label stack for the presence of the ELI, and look beyond the label stack only if it does not find it, thus retaining the old behavior when needed, yet avoiding unnecessary work if not.

4.4. Penultimate Hop LSR

No change is needed at penultimate hop LSRs.

5. Signaling for Entropy Labels

An egress LSR Y can signal to ingress LSR(s) its ability to process entropy labels (henceforth called "Entropy Label Capability" or ELC) on a given tunnel. Note that Entropy Label Capability may be asymmetric: if LSRs X and Y are at opposite ends of a tunnel, X may be able to process entropy labels, whereas Y may not. The signaling extensions below allow for this asymmetry.

For an illustration of signaling and forwarding with entropy labels, see Section 9.

5.1. LDP Signaling

A new LDP TLV ([RFC5036]) is defined to signal an egress's ability to process entropy labels.
This is called the ELC TLV, and may appear as an Optional Parameter of the Label Mapping Message TLV.

The presence of the ELC TLV in a Label Mapping Message indicates to ingress LSRs that the egress LSR can process entropy labels for the associated LDP tunnel. The ELC TLV has Type (TBD by IANA) and Length 0.

The structure of the ELC TLV is shown below.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |U|F|        Type (TBD)         |           Length (0)          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Figure 1: Entropy Label Capability TLV

where:

   U: Unknown bit. This bit MUST be set to 1. If the Entropy Label Capability TLV is not understood, then the TLV is not known to the receiver and MUST be ignored.

   F: Forward bit. This bit MUST be set to 1. Since this Capability TLV is going to be propagated hop-by-hop, the TLV should be forwarded even by nodes that may not understand it.

   Type: Type field. To be assigned by IANA.

   Length: Length field. This field specifies the total length in octets of the ELC TLV, and is currently defined to be 0.

5.2. BGP Signaling

When BGP [RFC4271] is used for distributing Network Layer Reachability Information (NLRI) as described in, for example, [RFC3107], the BGP UPDATE message may include the ELC attribute as part of the Path Attributes. This is an optional, transitive BGP attribute of type (to be assigned by IANA). The inclusion of this attribute with an NLRI indicates that the advertising BGP router can process entropy labels as an egress LSR for all routes in that NLRI.

A BGP speaker S that originates an UPDATE should include the ELC attribute only if both of the following are true:

   A1: S sets the BGP NEXT_HOP attribute to itself; AND

   A2: S can process entropy labels.

Suppose a BGP speaker T receives an UPDATE U with the ELC attribute. T has two choices. T can simply re-advertise U with the ELC attribute if either of the following is true:

   B1: T does not change the NEXT_HOP attribute; OR

   B2: T simply swaps labels without popping the entire label stack and processing the payload below.

An example of the use of B1 is Route Reflectors.

However, if T changes the NEXT_HOP attribute for U and in the data plane pops the entire label stack to process the payload, T MAY include an ELC attribute for UPDATE U' if both of the following are true:

   C1: T sets the NEXT_HOP attribute of U' to itself; AND

   C2: T can process entropy labels.

Otherwise, T MUST remove the ELC attribute.

5.3. RSVP-TE Signaling

Entropy Label support is signaled in RSVP-TE [RFC3209] using the Entropy Label Capability (ELC) flag in the Attribute Flags TLV of the LSP_ATTRIBUTES object [RFC5420]. The presence of the ELC flag in a Path message indicates that the ingress can process entropy labels in the upstream direction; this only makes sense for a bidirectional LSP and MUST be ignored otherwise. The presence of the ELC flag in a Resv message indicates that the egress can process entropy labels in the downstream direction.
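The Path and Resv semantics of the ELC flag described above can be illustrated with a small sketch. This is not part of the specification: the `RsvpElcState` container and the `entropy_label_directions` function are hypothetical names introduced here, and the sketch assumes that the flag values have already been parsed out of the LSP_ATTRIBUTES object.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RsvpElcState:
    """ELC flag observations for one RSVP-TE LSP (hypothetical container)."""
    bidirectional: bool  # True if the signaled LSP is bidirectional
    elc_in_path: bool    # ELC flag present in the Path message (set by the ingress)
    elc_in_resv: bool    # ELC flag present in the Resv message (set by the egress)


def entropy_label_directions(state: RsvpElcState) -> dict:
    """Return the directions in which entropy labels may be inserted."""
    # Downstream: the egress advertised its capability in the Resv message.
    downstream = state.elc_in_resv
    # Upstream: the flag in the Path message is honored only for a
    # bidirectional LSP; it is ignored otherwise.
    upstream = state.bidirectional and state.elc_in_path
    return {"downstream": downstream, "upstream": upstream}


# A unidirectional LSP: the ELC flag in the Path message is ignored.
print(entropy_label_directions(RsvpElcState(False, True, True)))
```

Keeping the two directions separate reflects that the flag in each message describes only its sender's ability to receive entropy labels.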
The bit number for the ELC flag is to be assigned by IANA.

6. Operations, Administration, and Maintenance (OAM) and Entropy Labels

Generally OAM comprises a set of functions operating in the data plane to allow a network operator to monitor its network infrastructure and to implement mechanisms in order to enhance the general behavior and the level of performance of its network, e.g., the efficient and automatic detection, localization, diagnosis and handling of defects.

Currently defined OAM mechanisms for MPLS include LSP Ping/Traceroute [RFC4379] and Bidirectional Forwarding Detection (BFD) for MPLS [RFC5884]. The latter provides connectivity verification between the endpoints of an LSP, and recommends establishing a separate BFD session for every path between the endpoints.

The LSP traceroute procedures of [RFC4379] allow an ingress LSR to obtain label ranges that can be used to send packets on every path to the egress LSR.
It works by having the ingress LSR sequentially ask the transit LSRs along a particular path to a given egress LSR to return a label range such that the inclusion of a label in that range in a packet will cause the replying transit LSR to send that packet out the egress interface for that path. The ingress provides the label range returned by transit LSR N to transit LSR N + 1, which returns a label range which is less than or equal in span to the range provided to it. This process iterates until the penultimate transit LSR replies to the ingress LSR with a label range that is acceptable to it and to all LSRs along the path preceding it for forwarding a packet along the path.

However, the LSP traceroute procedures do not specify where in the label stack the value from the label range is to be placed, whether deep packet inspection is allowed and if so, which keys and key values are to be used.

This memo updates LSP traceroute by specifying that the value from the label range is to be placed in the entropy label. Deep packet inspection is thus not necessary, although an LSR may use it, provided it does so consistently, i.e., if the label range to go to a given downstream LSR is computed with deep packet inspection, then the data path should use the same approach and the same keys.

In order to have a BFD session on a given path, a value from the label range for that path should be used as the EL value for BFD packets sent on that path.

7. MPLS-TP and Entropy Labels

Since MPLS-TP does not use ECMP, entropy labels are not applicable to an MPLS-TP deployment.

8. Point-to-Multipoint LSPs and Entropy Labels

Point-to-Multipoint (P2MP) LSPs [RFC4875] typically do not use ECMP for load balancing, as the combination of replication and multipathing can lead to duplicate traffic delivery. However, P2MP LSPs can traverse bundled links [RFC4201] and LAGs. In both these cases, load balancing is useful, and hence entropy labels can be of value for P2MP LSPs.

There is a potential complication with the use of entropy labels in the context of P2MP LSPs, a consequence of the fact that the entire label stack below the P2MP label must be the same for all egress LSRs. Namely, all egress LSRs must be willing to receive entropy labels; if even one egress LSR is not willing, then entropy labels MUST NOT be used for this P2MP LSP. In this regard, the ingress LSR MUST keep track of the ability of each egress LSR to process entropy labels, especially since the set of egress LSRs of a given P2MP LSP may change over time. Whenever an existing egress LSR leaves, or a new egress LSR joins the P2MP LSP, the ingress MUST re-evaluate whether or not to include entropy labels for the P2MP LSP.

In some cases, it may be feasible to deploy two P2MP LSPs, one to ELC egress LSRs, and the other to the remaining non-ELC egress LSRs.
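As a rough illustration of the P2MP rule above, the following sketch re-evaluates entropy label use whenever the egress set changes, and shows the split into an ELC group and a non-ELC group that a two-LSP deployment would need. The function names and the dictionary representation of egress state are assumptions made for this example only; treating an empty egress set as "do not use entropy labels" is also a choice of the sketch, not something the text specifies.

```python
def p2mp_may_use_entropy_labels(egress_elc: dict) -> bool:
    """True only if every current egress LSR is entropy label capable.

    egress_elc maps an egress LSR name to its signaled capability.
    """
    if not egress_elc:
        # Assumption of this sketch: with no egress LSRs, do not use ELs.
        return False
    return all(egress_elc.values())


def split_by_capability(egress_elc: dict) -> tuple:
    """Partition egress LSRs into (ELC, non-ELC) groups for two P2MP LSPs."""
    elc = sorted(name for name, capable in egress_elc.items() if capable)
    non_elc = sorted(name for name, capable in egress_elc.items() if not capable)
    return elc, non_elc


# The ingress re-evaluates whenever an egress LSR joins or leaves.
egresses = {"Y1": True, "Y2": True}
print(p2mp_may_use_entropy_labels(egresses))  # all egresses are ELC

egresses["Y3"] = False                        # a non-ELC egress joins
print(p2mp_may_use_entropy_labels(egresses))  # entropy labels must not be used
print(split_by_capability(egresses))          # candidate groups for two LSPs
```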
However, this requires more state in the network, more bandwidth, and more operational overhead (tracking ELC LSRs, and provisioning P2MP LSPs accordingly). Alternatively, an ingress LSR may choose to signal two separate P2MP LSPs, one to ELC egresses, the other to non-ELC egresses, trading off implementation complexity for operational complexity.

9. Entropy Labels in Various Scenarios

This section describes the use of entropy labels in various scenarios.

In the figures below, the following conventions are used to depict processing between X and Y. Note that control plane signaling goes right to left, whereas data plane processing goes left to right.

   Protocols
       Y:  <--- [L, E]                 Y signals L to X
   X ------------- Y
   LS: <L, ELI, EL>                    Label stack
   X:  +<L, ELI, EL>                   X pushes <L, ELI, EL>
   Y:  -<L, ELI, EL>                   Y pops <L, ELI, EL>

This means that Y signals to X label L for an LDP tunnel. E can be one of:

   0: meaning egress is NOT entropy label capable, or

   1: meaning egress is entropy label capable.

The line with LS: shows the label stack on the wire. Below that is the operation that each LSR does in the data plane, where + means push the following label stack, - means pop the following label stack, L~L' means swap L with L', and * means that the operation is not depicted.

9.1. LDP Tunnel

The following illustrates several simple intra-AS LDP tunnels. The first diagram shows ultimate hop popping (UHP) with ingress inserting an EL, the second UHP with no ELs, the third PHP with ELs, and finally, PHP with no ELs, but also with an application label AL (which could, for example, be a VPN label). Note that, in all the cases below, the MPLS application does not matter; it may be that X pushes some more labels (perhaps for a VPN or VPLS) below the ones shown, and Y pops them.

   A: <--- [TL4, 1]
                 B: <-- [TL3, 1]
                                ...
                                    W: <-- [TL1, 1]
                                              Y: <-- [TL0, 1]
   X --------------- A --------- B ... W ---------- Y
   LS: <TL4, ELI, EL> <TL3,ELI,EL>      <TL0,ELI,EL>
   X: +<TL4, ELI, EL>
                     A: TL4~TL3
                                 B: TL3~TL2
                                     ...
                                        W: TL1~TL0
                                            Y: -<TL0, ELI, EL>

   LDP with UHP; ingress inserts ELs

   A: <--- [TL4, 1]
                 B: <-- [TL3, 1]
                                ...
                                    W: <-- [TL1, 1]
                                              Y: <-- [TL0, 1]
   X --------------- A --------- B ... W ---------- Y
   LS: <TL4>          <TL3>             <TL0>
   X: +<TL4>
                     A: TL4~TL3
                                 B: TL3~TL2
                                     ...
                                        W: TL1~TL0
                                            Y: -<TL0>

   LDP with UHP; ingress does not insert ELs

   A: <--- [TL4, 1]
                 B: <-- [TL3, 1]
                                ...
                                    W: <-- [TL1, 1]
                                              Y: <-- [3, 1]
   X --------------- A --------- B ... W ---------- Y
   X: +<TL4, ELI, EL>
                     A: TL4~TL3
                                 B: TL3~TL2
                                     ...
                                        W: -TL1
                                            Y: -<ELI, EL>

   LDP with PHP; ingress inserts ELs

   A: <--- [TL4, 1]
                 B: <-- [TL3, 1]
                                ...
                                    W: <-- [TL1, 1]
                                              Y: <-- [3, 1]
   VPN: <------------------------------------------ [AL]
   X --------------- A --------- B ... W ---------- Y
   LS: <TL4, AL>      <TL3, AL>         <AL>
   X: +<TL4, AL>
                     A: TL4~TL3
                                 B: TL3~TL2
                                     ...
                                        W: -TL1
                                            Y: -<AL>

   LDP with PHP + VPN; ingress does not insert ELs

9.2. LDP Over RSVP-TE

The following illustrates "LDP over RSVP-TE" tunnels. X and Y are the ingress and egress (respectively) of the LDP tunnel; A and W are the ingress and egress of the RSVP-TE tunnel. It is assumed that both the LDP and RSVP-TE tunnels have PHP.

   LDP with ELs, RSVP-TE without ELs

   LDP: <--- [L4, 1]   <------- [L3, 1]   <--- [3, 1]
   RSVP-TE:               <-- [Rn, 0]  <-- [3, 0]
   X --------------- A --------- B ... W ---------- Y
   LS: <L4, ELI, EL>  <Rn,L3,ELI,EL>  ...  <ELI, EL>
   DP: +<L4, ELI, EL>  L4~<Rn, L3>  *  -L1  -<ELI, EL>

   Figure 2: LDP over RSVP-TE Tunnels

9.3. MPLS Applications

An ingress LSR X must keep state per unicast tunnel as to whether the egress for that tunnel can process entropy labels. X does not have to keep state per application running over that tunnel. However, an ingress PE can choose on a per-application basis whether or not to insert ELs. For example, X may have an application for which it does not wish to use ECMP (e.g., circuit emulation), or for which it does not know which keys to use for load balancing (e.g., Appletalk over a pseudowire). In either of those cases, X may choose not to insert entropy labels, but may choose to insert entropy labels for an IP VPN over the same tunnel.

10. Security Considerations

This document describes advertisement of the capability to support receipt of entropy labels which an ingress LSR may insert in MPLS packets in order to allow transit LSRs to attain better load balancing across LAG and/or ECMP paths in the network.

This document does not introduce new security vulnerabilities to LDP, BGP or RSVP-TE. Please refer to the Security Considerations section of these protocols ([RFC5036], [RFC4271] and [RFC3209]) for security mechanisms applicable to each.

Given that there is no end-user control over the values used for entropy labels, there is little risk of Entropy Label forgery which could cause uneven load-balancing in the network.

If Entropy Label Capability is not signaled from an egress PE to an ingress PE, due to, for example, malicious configuration activity on the egress PE, then the ingress PE will fall back to not using entropy labels for load-balancing traffic over LAG or ECMP paths, which is in general no worse than the behavior observed in current production networks. That said, it is recommended that operators monitor changes to PE configurations and, more importantly, the fairness of load distribution over LAG or ECMP paths.
If the fairness of load distribution over a set of paths changes, that could indicate a misconfiguration, bug or other non-optimal behavior on their PEs, and they should take corrective action.

11. IANA Considerations

11.1. Reserved Label for ELI

IANA is requested to allocate a reserved label for the Entropy Label Indicator (ELI) from the "Multiprotocol Label Switching Architecture (MPLS) Label Values" Registry.

11.2. LDP Entropy Label Capability TLV

IANA is requested to allocate the next available value from the IETF Consensus range in the LDP TLV Type Name Space Registry as the "Entropy Label Capability TLV".

11.3. BGP Entropy Label Capability Attribute

IANA is requested to allocate the next available Path Attribute Type Code from the "BGP Path Attributes" registry as the "BGP Entropy Label Capability Attribute".

11.4. RSVP-TE Entropy Label Capability flag

IANA is requested to allocate a new bit from the "Attribute Flags" sub-registry of the "RSVP TE Parameters" registry.

   Bit | Name                     | Attribute  | Attribute  | RRO
   No  |                          | Flags Path | Flags Resv |
   ----+--------------------------+------------+------------+-----
   TBD   Entropy Label Capability   Yes          Yes          No

12. Acknowledgments

We wish to thank Ulrich Drafz for his contributions, as well as the entire 'hash label' team for their valuable comments and discussion.

Sincere thanks to Nischal Sheth for his many suggestions and comments, and his careful reading of the document, especially with regard to data plane processing of entropy labels.

13. References

13.1. Normative References

[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.

[RFC3032] Rosen, E., Tappan, D., Fedorkow, G., Rekhter, Y., Farinacci, D., Li, T., and A. Conta, "MPLS Label Stack Encoding", RFC 3032, January 2001.

[RFC3107] Rekhter, Y. and E. Rosen, "Carrying Label Information in BGP-4", RFC 3107, May 2001.

[RFC3209] Awduche, D., Berger, L., Gan, D., Li, T., Srinivasan, V., and G. Swallow, "RSVP-TE: Extensions to RSVP for LSP Tunnels", RFC 3209, December 2001.

[RFC5420] Farrel, A., Papadimitriou, D., Vasseur, JP., and A. Ayyangarps, "Encoding of Attributes for MPLS LSP Establishment Using Resource Reservation Protocol Traffic Engineering (RSVP-TE)", RFC 5420, February 2009.

13.2. Informative References

[I-D.ietf-pwe3-fat-pw] Bryant, S., Filsfils, C., Drafz, U., Kompella, V., Regan, J., and S. Amante, "Flow Aware Transport of Pseudowires over an MPLS Packet Switched Network", draft-ietf-pwe3-fat-pw-07 (work in progress), July 2011.
[RFC4201] Kompella, K., Rekhter, Y., and L. Berger, "Link Bundling in MPLS Traffic Engineering (TE)", RFC 4201, October 2005.

[RFC4271] Rekhter, Y., Li, T., and S. Hares, "A Border Gateway Protocol 4 (BGP-4)", RFC 4271, January 2006.

[RFC4364] Rosen, E. and Y. Rekhter, "BGP/MPLS IP Virtual Private Networks (VPNs)", RFC 4364, February 2006.

[RFC4379] Kompella, K. and G. Swallow, "Detecting Multi-Protocol Label Switched (MPLS) Data Plane Failures", RFC 4379, February 2006.

[RFC4447] Martini, L., Rosen, E., El-Aawar, N., Smith, T., and G. Heron, "Pseudowire Setup and Maintenance Using the Label Distribution Protocol (LDP)", RFC 4447, April 2006.

[RFC4761] Kompella, K. and Y. Rekhter, "Virtual Private LAN Service (VPLS) Using BGP for Auto-Discovery and Signaling", RFC 4761, January 2007.

[RFC4762] Lasserre, M. and V. Kompella, "Virtual Private LAN Service (VPLS) Using Label Distribution Protocol (LDP) Signaling", RFC 4762, January 2007.

[RFC4875] Aggarwal, R., Papadimitriou, D., and S. Yasukawa, "Extensions to Resource Reservation Protocol - Traffic Engineering (RSVP-TE) for Point-to-Multipoint TE Label Switched Paths (LSPs)", RFC 4875, May 2007.

[RFC5036] Andersson, L., Minei, I., and B. Thomas, "LDP Specification", RFC 5036, October 2007.

[RFC5586] Bocci, M., Vigoureux, M., and S. Bryant, "MPLS Generic Associated Channel", RFC 5586, June 2009.

[RFC5884] Aggarwal, R., Kompella, K., Nadeau, T., and G. Swallow, "Bidirectional Forwarding Detection (BFD) for MPLS Label Switched Paths (LSPs)", RFC 5884, June 2010.

Appendix A. Applicability of LDP Entropy Label Capability TLV

In the case of unlabeled IPv4 (Internet) traffic, the Best Current Practice is for an egress LSR to propagate eBGP learned routes within an SP's Autonomous System after resetting the BGP next-hop attribute to one of its Loopback IP addresses. That Loopback IP address is injected into the Service Provider's IGP and, concurrently, a label is assigned to it via LDP.
Thus, when an ingress LSR is performing a forwarding lookup for a BGP destination, it recursively resolves the associated next-hop to a Loopback IP address and associated LDP label of the egress LSR.

Thus, in the context of unlabeled IPv4 traffic, the LDP Entropy Label Capability TLV will typically be applied only to the FEC for the Loopback IP address of the egress LSR, and the egress LSR need not announce an entropy label capability for the eBGP learned route.

Authors' Addresses

   Kireeti Kompella
   Juniper Networks
   1194 N. Mathilda Ave.
   Sunnyvale, CA 94089
   US
   Email: kireeti@juniper.net

   John Drake
   Juniper Networks
   1194 N. Mathilda Ave.
   Sunnyvale, CA 94089
   US
   Email: jdrake@juniper.net

   Shane Amante
   Level 3 Communications, LLC
   1025 Eldorado Blvd
   Broomfield, CO 80021
   US
   Email: shane@level3.net

   Wim Henderickx
   Alcatel-Lucent
   Copernicuslaan 50
   2018 Antwerp
   Belgium
   Email: wim.henderickx@alcatel-lucent.com

   Lucy Yong
   Huawei USA
   5340 Legacy Dr.
   Plano, TX 75024
   US
   Email: lucy.yong@huawei.com