Network Working Group                                        K. Kompella
Internet-Draft                                          Juniper Networks
Updates: 3031 (if approved)                                    S. Amante
Intended status: Standards Track             Level 3 Communications, LLC
Expires: January 8, 2009                                    July 7, 2008

              The Use of Entropy Labels in MPLS Forwarding
                  draft-kompella-mpls-entropy-label-01

Status of this Memo

By submitting this Internet-Draft, each author represents that any applicable patent or other IPR claims of which he or she is aware have been or will be disclosed, and any of which he or she becomes aware will be disclosed, in accordance with Section 6 of BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt.

The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html.

This Internet-Draft will expire on January 8, 2009.

Kompella & Amante        Expires January 8, 2009                [Page 1]
Internet-Draft            MPLS Entropy Labels                  July 2008

Abstract

Load balancing is a powerful tool for engineering traffic across a network. This memo suggests ways of improving load balancing across MPLS networks using the notion of "entropy labels". It defines the concept, describes why they are needed, suggests how they can be used, and enumerates properties of entropy labels that allow optimal benefit.

Table of Contents

   1. Introduction
      1.1. Motivation
      1.2. Conventions used
   2. Approaches
   3. Entropy Labels
   4. Forwarding and Load Balancing Behaviors for Entropy Labels
      4.1. Ingress LSR
      4.2. Transit LSR
      4.3. Egress LSRs
   5. Signaling for Entropy Labels
      5.1. LDP Signaling
   6. Security Considerations
   7. Acknowledgments
   8. References
      8.1. Normative References
      8.2. Informative References
   Authors' Addresses
   Intellectual Property and Copyright Statements

1. Introduction

Load balancing, or multi-pathing, is an attempt to balance traffic across a network by allowing the traffic to use several paths, not just a single shortest path. Load balancing has several benefits: it eases capacity planning; it can help absorb traffic surges by spreading them across several links; and it allows better resilience by offering alternate paths should a link or node fail.

As providers scale their networks, they resort to a small number of techniques to achieve greater bandwidth between nodes and, subsequently, depend on load balancing of traffic over those paths. Two widely used techniques are Link Aggregation (LAG) and Equal-Cost Multi-Path (ECMP). LAG is used only to bond together several physical circuits between two adjacent nodes so that they appear to higher-layer protocols as a single, higher-bandwidth "virtual" pipe.
On the other hand, ECMP is used between two nodes, separated by one or more hops, to allow load-sharing over more than just the shortest path in the network; this is typically obtained by arranging IGP metrics such that there are several equal-cost paths between source-destination pairs. In summary, both of these techniques may, and oftentimes do, co-exist in various parts of a given provider's network, depending on various choices made by the provider.

A very important consideration when load balancing is that packets belonging to a given "flow" MUST be mapped to the same path, i.e., the same exact sequence of links across the network. This is to avoid jitter, latency and re-ordering issues for the flow. However, what constitutes a flow varies considerably. A common example of a flow is a TCP session. Other examples are L2TP sessions corresponding to broadband users, or traffic within an ATM virtual circuit. A flow is usually defined, for the purposes of forwarding and load balancing, by a hash computed on packet headers such that packets belonging to a given flow map to the same hash value. The fields chosen for such a hash depend on the packet type; a typical set (for IP packets) is the IP source and destination address, the protocol type, and (for TCP and UDP traffic) the source and destination port numbers. A conservative choice of fields leads to many flows mapping to the same hash value (and consequently poor load balancing); an overly aggressive choice may map a flow to multiple values, potentially causing the issues mentioned above.

For MPLS networks, most of the same principles (and benefits) apply. However, finding useful fields in a packet for the purpose of load balancing can be more of a challenge. In many cases, the extra encapsulation may require fairly deep inspection of packets to find these fields at every hop.
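
As an illustration of the flow hash described above, the following Python sketch hashes the typical IP field set (addresses, protocol, ports) and maps the result to one of several equal-cost paths. The names FiveTuple, flow_hash, coarse_hash and pick_path, and the use of SHA-256, are assumptions made for this example only; actual forwarding hardware uses its own hash functions and field selection.

```python
import hashlib
import ipaddress
import struct
from typing import NamedTuple


class FiveTuple(NamedTuple):
    """Hypothetical container for the header fields named in the text."""
    src: str
    dst: str
    proto: int
    sport: int
    dport: int


def _digest(key: bytes) -> int:
    # SHA-256 is a stand-in hash (an assumption); only determinism matters.
    return int.from_bytes(hashlib.sha256(key).digest()[:4], "big")


def flow_hash(ft: FiveTuple) -> int:
    """Hash addresses, protocol and ports: one value per TCP/UDP session."""
    key = (ipaddress.ip_address(ft.src).packed
           + ipaddress.ip_address(ft.dst).packed
           + struct.pack("!BHH", ft.proto, ft.sport, ft.dport))
    return _digest(key)


def coarse_hash(ft: FiveTuple) -> int:
    """Conservative field choice: addresses only, so sessions collide."""
    key = (ipaddress.ip_address(ft.src).packed
           + ipaddress.ip_address(ft.dst).packed)
    return _digest(key)


def pick_path(ft: FiveTuple, n_paths: int) -> int:
    """Every packet of a flow yields the same hash, hence the same path."""
    return flow_hash(ft) % n_paths


# Two TCP sessions between the same pair of hosts (documentation addresses).
a = FiveTuple("192.0.2.1", "198.51.100.7", 6, 49152, 80)
b = FiveTuple("192.0.2.1", "198.51.100.7", 6, 49153, 80)
```

Here a and b differ only in source port. coarse_hash gives both the same value, so the two sessions can never be placed on different paths, whereas pick_path(a, 4) is stable for every packet of session a.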
An idea for removing the need for this deep inspection is to extract this information *once*, at the ingress of an MPLS Label Switched Path (LSP), and to encode the load balancing information within the label stack itself, alongside the forwarding semantics of the label stack. This information can then be used on all MPLS hops across the network. There are three key reasons why this is beneficial:

   1. at the ingress of the LSP, MPLS encapsulation hasn't yet occurred, so deep inspection is not necessary;

   2. the ingress of an LSP has more context and information about incoming packets than transit nodes; and

   3. ingress nodes usually operate at lower bandwidths than transit nodes, allowing them to do more work per packet.

This memo describes a few approaches to solving this problem, and focuses on one method, which uses the notion of entropy labels. The memo defines entropy labels, describes why they are needed, and specifies their properties in the forwarding plane: how they are generated and received, and what is expected of transit Label Switching Routers (LSRs). Finally, it describes in general how signaling works and what needs to be signaled, as well as specifics for LDP.

1.1. Motivation

MPLS is a very successful generic forwarding substrate that may transport several dozen types of protocols, most notably IP, PWE3, VPLS and IP VPNs. Within each type of protocol, there typically exist several variants as far as load-sharing is concerned, e.g.:

   IP: IPv4, IPv6, IPv6 in IPv4, etc.;

   PWE3: Ethernet, ATM, Frame-Relay, etc.

There are also several different types of Ethernet over PW encapsulation, ATM over PW encapsulation, etc. Finally, given the popularity of MPLS, it is likely that it will continue to be extended to transport new protocols as the need arises.

Currently, each MPLS LSR along a given path needs to individually infer the underlying protocol within an MPLS packet in order to then extract appropriate keys from the payload. Those keys are then used as input into a hash algorithm to determine the specific output interface on an LSR that is used for that given "microflow". Unfortunately, if the MPLS LSR is unable to infer the MPLS packet's payload (as is often the case), it will typically resort to using the topmost MPLS labels in the MPLS stack as keys to the load-hashing algorithm. The result is an extremely inequitable distribution of traffic across multiple equal-cost paths exiting that node, simply because the topmost MPLS labels are very coarse-grained forwarding labels that typically describe a next-hop, or provide some other type of mux/demux forwarding function, and do not describe the granularity of the underlying traffic.

On the other hand, ingress MPLS LERs (PE routers) have detailed knowledge of an MPLS packet's contents, typically through a priori configuration of the encapsulation(s) expected at a given PE-CE interface (e.g., IPv4, IPv6, VPLS, etc.). PE routers need this information to: a) discern the packet's CoS forwarding treatment; b) apply filters to forward or block traffic to/from the CE; c) forward routing/control traffic to an onboard management processor; or d) load-share the traffic on their uplinks to P routers. By knowing the expected encapsulation types, an ingress PE router could apply a smaller subset of payload parsing routines to extract keys appropriate for the given protocol. Ultimately, this should allow for significantly improved accuracy in determining the appropriate load-balancing behavior for each protocol. In addition, compared to MPLS LSRs, PE routers typically operate at lower forwarding rates and have more flexible forwarding hardware.
As a result, a PE router can typically adapt much more quickly to new/emerging protocols and determine the appropriate keys used for load-sharing that type of traffic through the network.

An additional advantage of applying entropy labels only at the edge of the network, on PE routers, would be that core/transit MPLS LSRs could once again return to being completely oblivious to the contents of each MPLS packet, and only use the outer MPLS labels to determine forwarding and forwarding treatment of MPLS packets. Specifically, there will be no reason to duplicate, from MPLS LERs, extremely complex packet/payload parsing functionality within MPLS LSRs and attempt to keep this functionality at parity across all network elements, i.e., both MPLS LSRs and LERs. Ultimately, this should result in less complexity within core LSRs, allowing them to more easily scale to higher forwarding rates and larger port density, consume less power, etc. Finally, the approach discussed in this memo would allow for more rapid deployment of new protocols, since MPLS LSRs will not have to be developed or modified to understand how to properly extract keys to achieve good load-sharing of traffic throughout the network.

In summary, MPLS LSRs are ill-equipped to infer the protocol within a packet's payload and choose appropriate keys within the payload to correctly identify a given "microflow", which is required to provide the most equitable load-sharing over multiple equal-cost paths. On the other hand, PE routers have both the knowledge and capabilities to more accurately determine the load-sharing treatment that MPLS LSRs should apply to a given protocol encapsulated within MPLS.

1.2. Conventions used

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC2119].

Label stacks are denoted <L1, L2, L3>, where L1 is the "outermost" label and L3 the innermost (closest to the payload). Packet flows are depicted left to right, and signaling is shown right to left (unless otherwise indicated).

2. Approaches

There are two main approaches to encoding load balancing information in the label stack. The first allocates multiple labels for a particular Forwarding Equivalence Class (FEC). These labels are equivalent in terms of forwarding semantics, but having several allows flexibility in assigning labels to flows from the same FEC. The other approach encodes the load balancing information as a separate label in the label stack. Here, there are two sub-approaches, based on whether this load-balancing label is signaled or not.

The first approach has the advantage that the label stack stays the same depth whether or not label-based load balancing is used; consequently, so do forwarding operations on transit and egress LSRs. However, it has a major drawback in that signaling and forwarding state are both increased significantly. The number of independent choices for load balancing packets belonging to a FEC limits the effectiveness of load balancing, so one would like this number to be large. However, the larger this number is, the greater the signaling and forwarding state in the network.

The second approach increases the size of the label stack by one label. This consequently affects operations on ingress, transit and egress LSRs. The sub-approach of signaling the load-balancing labels increases signaling and forwarding state, and so suffers from some of the problems of the first approach.

The approach advocated by this memo, and the only one described in detail, is the one where the load-balancing labels are not signaled. With this approach, there is minimal change to signaling state for a FEC; also, there is no change in forwarding operations in transit LSRs, and no increase of forwarding state in any LSR. The only purpose of these labels is to increase the entropy in the label stack, so they are called "entropy labels".

3. Entropy Labels

An entropy label (as used here) is a label:

   1. that is not used for forwarding;

   2. that is not signaled; and

   3. whose only purpose in the label stack is to provide "entropy" to improve load balancing.

Entropy labels are generated by an ingress LSR, based entirely on load balancing information. However, they MUST NOT have values in the reserved label space (0-15). Entropy labels MUST be at the bottom of the label stack, and thus the "end-of-stack" bit in the label should be set. To ensure that they are not used inadvertently for forwarding, entropy labels SHOULD have a TTL of 0.

Since entropy labels are generated by the ingress LSR, an egress LSR MUST be able to tell unambiguously that a given label is an entropy label. This of course depends on the underlying application. If any ambiguity is possible, the label above the entropy label MUST be an "entropy label indicator" (ELI), which says that the following label is an entropy label. The ELI may be signaled, or may be a reserved label set aside specifically for this purpose. Fortunately, for many applications, the use of entropy labels is unambiguous, and does not need an ELI.

Applications for MPLS entropy labels include pseudowires ([RFC4447], [I-D.bryant-filsfils-fat-pw]), Layer 3 VPNs ([RFC4364]), VPLS ([RFC4761], [RFC4762]) and tunnel LSPs. This memo specifies general properties of entropy labels, and the signaling of entropy labels for LDP ([RFC3036]) tunnel LSPs.
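
The constraints stated earlier in this section (a value outside the reserved range 0-15, the end-of-stack bit set, and a TTL of 0) can be sketched in Python as follows. The sketch assumes the standard 32-bit MPLS label stack entry layout (20-bit label, 3-bit traffic class, 1-bit end-of-stack, 8-bit TTL). The function names and the modulo folding of a hash into the usable label range are assumptions made for illustration; this memo does not prescribe how the label value is derived.

```python
import struct

LABEL_BITS = 20          # width of the label field in a label stack entry
RESERVED_MAX = 15        # labels 0-15 are reserved and must be avoided


def entropy_label_value(flow_hash: int) -> int:
    """Fold a hash into the 20-bit label space, skipping labels 0-15."""
    usable = (1 << LABEL_BITS) - (RESERVED_MAX + 1)
    return (RESERVED_MAX + 1) + (flow_hash % usable)


def encode_entry(label: int, tc: int = 0, eos: bool = False,
                 ttl: int = 0) -> bytes:
    """Pack one label stack entry: label(20) | tc(3) | eos(1) | ttl(8)."""
    if not 0 <= label < (1 << LABEL_BITS):
        raise ValueError("label must fit in 20 bits")
    if not 0 <= tc < 8 or not 0 <= ttl < 256:
        raise ValueError("tc or ttl out of range")
    word = (label << 12) | (tc << 9) | (int(eos) << 8) | ttl
    return struct.pack("!I", word)


def encode_entropy_label(flow_hash: int) -> bytes:
    """Entropy label entry: bottom of stack (EOS=1) and TTL=0."""
    return encode_entry(entropy_label_value(flow_hash), eos=True, ttl=0)
```

For example, a hash of 0 maps to label 16, the first non-reserved value, and encodes as the four bytes 00 01 01 00.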
Other memos will specify the signaling and use of entropy labels for specific applications.

4. Forwarding and Load Balancing Behaviors for Entropy Labels

4.1. Ingress LSR

Suppose that for a particular application (or FEC), an ingress LSR X has to push label stack <TL, AL>, where TL is the "tunnel label" and AL is the application label. (Note the use of the convention for label stacks described in Section 1.2. The use of a two-label stack is just for illustrative purposes.) Suppose furthermore that X is to use entropy labels for this application. Thus, the resultant label stack will be <TL, AL, EL>, where EL is the entropy label.

When a packet for this FEC arrives at X, X must first determine the fields that it will use for load balancing. Typically, X will then generate a hash H over those fields. X will then pick an outgoing label stack to push on the packet. However, X must also generate an entropy label EL (based either directly on the load balancing fields, or on the hash H). EL is a "regular" 32-bit label, encoded in the usual way; however, the EOS bit MUST be 1 and the TTL field MUST be 0. X then pushes <TL, AL, EL> on to the packet before forwarding it to the next LSR. If X is told (via signaling) that it must use an entropy label indicator ELI, then X instead pushes <TL, AL, ELI, EL> on to the packet.

Note that ingress LSR X MUST NOT include an entropy label unless the egress LSR for this FEC has indicated that it is ready to receive entropy labels. Furthermore, if the egress LSR has signaled that an ELI is needed, then X MUST either include the ELI with the entropy label or not use entropy labels at all.

4.2. Transit LSR

Transit LSRs have no change in forwarding behavior. For load balancing, transit LSRs SHOULD use the whole label stack (e.g., for computing the load balance hash).
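
To illustrate why hashing the whole label stack matters at a transit LSR, the Python sketch below compares hashing only the top label with hashing a full stack made of a tunnel label TL, an application label AL and an entropy label EL. The label values, the four-link bundle, the helper names and the use of SHA-256 are all assumptions for this example; they are not taken from any implementation.

```python
import hashlib
from typing import Sequence


def stack_hash(labels: Sequence[int]) -> int:
    """Hash a sequence of 20-bit label values (SHA-256 as a stand-in)."""
    data = b"".join(label.to_bytes(3, "big") for label in labels)
    return int.from_bytes(hashlib.sha256(data).digest()[:4], "big")


def pick_link(labels: Sequence[int], n_links: int,
              whole_stack: bool = True) -> int:
    """Choose an outgoing link from the whole stack or the top label only."""
    keys = labels if whole_stack else labels[:1]
    return stack_hash(keys) % n_links


TL, AL = 1000, 2000                      # hypothetical tunnel/application labels
N_LINKS = 4                              # hypothetical ECMP/LAG fan-out

# 64 flows of the same FEC, each given a distinct entropy label by the ingress.
stacks = [(TL, AL, el) for el in range(16, 80)]

links_top_only = {pick_link(s, N_LINKS, whole_stack=False) for s in stacks}
links_whole = {pick_link(s, N_LINKS) for s in stacks}
```

With top-label hashing, every flow of the FEC lands on the single link selected by TL, so links_top_only has exactly one member. With whole-stack hashing the entropy label spreads the flows over several links, while any one flow (one fixed stack) always maps to the same link.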
Transit LSRs MAY choose to look beyond the label stack for further load balancing information; however, if entropy labels are being used, this may not be very useful. In a mixed environment (or for backward compatibility), this is the simplest approach.

Thus, transit LSRs are almost unaffected by the use of entropy labels. If transit LSRs were programmed to use a subset of the label stack, they may have to be reconfigured to use the full stack. Otherwise, no changes are needed.

4.3. Egress LSRs

An ingress LSR X MUST NOT send entropy labels to an egress LSR Y unless Y has signaled its readiness to receive such labels. Y must also determine (for a particular application or FEC) whether it can tell if the ingress has added an entropy label; if Y cannot do so, Y MUST either request that an ELI be used for this FEC or require the use of entropy labels. (See Section 5 for more details on signaling.)

Suppose Y has signaled that it is prepared to receive entropy labels for a given FEC. In this case, Y must be able to distinguish whether an ingress LSR has inserted an entropy label or not based solely on the 'end-of-stack' (EOS) bit on the application label for this FEC. When Y receives a packet with this application label, Y checks whether the EOS bit is set. If not, Y assumes that the label below is an entropy label and pops it. Y MAY choose to ensure that the entropy label has its EOS bit set and TTL=0. Y then processes the packet as usual. Implementations may choose the order in which they apply these operations, but the net result should be as specified.

5. Signaling for Entropy Labels

Signaling for entropy labels exchanges three types of information:

   1. whether an LSR Y is prepared to receive entropy labels, or that Y MUST receive entropy labels;

   2. whether receiving LSR Y requires ELIs with entropy labels, and if so, what label to use as the ELI; and

   3. whether an LSR X is able to send entropy labels.

The uses of this information can be illustrated as follows. If an LSR Y is prepared to receive entropy labels for an application (or FEC), it signals that to the ingress LSR(s). That means that an ingress LSR for this application MAY send an entropy label for this application; Y MUST be able to distinguish whether or not an entropy label was sent based solely on the EOS bit on the application label.

If this is not the case, Y can choose one of two approaches. Y can signal that an ELI MUST be used for this FEC; Y may also signal what ELI to use. In this case, an ingress LSR will either not send an entropy label, or push the ELI before the entropy label. This makes the use/non-use of an entropy label unambiguous. However, this also increases the size of the label stack.

An alternative approach is that Y signals that entropy labels MUST be used. An ingress LSR MUST acknowledge that it will do so (via signaling); if an ingress LSR cannot do so, the signaling for this application MUST renegotiate to not use entropy labels (or fail).

The specific protocols and encoding details for the above will depend on the underlying application; see [I-D.bryant-filsfils-fat-pw] for an example for pseudowires.

5.1. LDP Signaling

TBD

6. Security Considerations

Having security is a Good Thing.

7. Acknowledgments

We wish to thank Ulrich Drafz for his contributions, as well as the entire "hash label" team for their valuable comments and discussion.

8. References

8.1. Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.

8.2. Informative References

   [I-D.bryant-filsfils-fat-pw]  Bryant, S., Filsfils, C., and U. Drafz, "Load Balancing Fat MPLS Pseudowires", draft-bryant-filsfils-fat-pw-01 (work in progress), February 2008.

   [RFC3036]  Andersson, L., Doolan, P., Feldman, N., Fredette, A., and B. Thomas, "LDP Specification", RFC 3036, January 2001.

   [RFC4364]  Rosen, E. and Y. Rekhter, "BGP/MPLS IP Virtual Private Networks (VPNs)", RFC 4364, February 2006.

   [RFC4447]  Martini, L., Rosen, E., El-Aawar, N., Smith, T., and G. Heron, "Pseudowire Setup and Maintenance Using the Label Distribution Protocol (LDP)", RFC 4447, April 2006.

   [RFC4761]  Kompella, K. and Y. Rekhter, "Virtual Private LAN Service (VPLS) Using BGP for Auto-Discovery and Signaling", RFC 4761, January 2007.

   [RFC4762]  Lasserre, M. and V. Kompella, "Virtual Private LAN Service (VPLS) Using Label Distribution Protocol (LDP) Signaling", RFC 4762, January 2007.

Authors' Addresses

   Kireeti Kompella
   Juniper Networks
   1194 N. Mathilda Ave.
   Sunnyvale, CA 94089
   US

   Email: kireeti@juniper.net

   Shane Amante
   Level 3 Communications, LLC
   1025 Eldorado Blvd
   Broomfield, CO
   US

   Email: shane@level3.net

Full Copyright Statement

Copyright (C) The IETF Trust (2008).

This document is subject to the rights, licenses and restrictions contained in BCP 78, and except as set forth therein, the authors retain all their rights.

This document and the information contained herein are provided on an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Intellectual Property

The IETF takes no position regarding the validity or scope of any Intellectual Property Rights or other rights that might be claimed to pertain to the implementation or use of the technology described in this document or the extent to which any license under such rights might or might not be available; nor does it represent that it has made any independent effort to identify any such rights. Information on the procedures with respect to rights in RFC documents can be found in BCP 78 and BCP 79.

Copies of IPR disclosures made to the IETF Secretariat and any assurances of licenses to be made available, or the result of an attempt made to obtain a general license or permission for the use of such proprietary rights by implementers or users of this specification can be obtained from the IETF on-line IPR repository at http://www.ietf.org/ipr.

The IETF invites any interested party to bring to its attention any copyrights, patents or patent applications, or other proprietary rights that may cover technology that may be required to implement this standard. Please address the information to the IETF at ietf-ipr@ietf.org.