PCE Working Group                                          Shankar Raman
Internet-Draft                                Balaji Venkat Venkataswami
Intended Status: Experimental RFC                           Gaurav Raina
Expires: August 2013                                          IIT Madras
                                                       February 18, 2013

          Constructing power optimal P2MP TE-LSPs within an AS
                   draft-mjsraman-pce-power-replic-02

Abstract

Power consumption in multicast replication is an area of concern, and choosing replication points that decrease overall power consumption is therefore important. Multicast replication capacity is an attribute of every line card of the major routers and multi-layer switches that support multicast in the core of an Internet Service Provider (ISP) or an enterprise network. Currently, multicast replication points on Point-to-Multipoint Traffic Engineering Label-Switched-Paths (P2MP TE-LSPs) consume power while delivering multiple output streams of data from a given input stream. The multicast distribution trees are constructed without regard for the placement of the replication points or the resulting power consumption at those points. This overloads certain routers while under-utilizing others. Optimal use of these replication resources could substantially reduce power consumption on these routers. In this document, we propose a mechanism by which P2MP TE-LSPs are constructed for carrying multicast traffic across multiple areas within a given AS. We propose that these LSPs be built using the power to available replication capacity ratio advertised by fine-grained components such as the multicast-capable line cards of routers and multi-layer switches deployed within an AS.

Status of this Memo

This Internet-Draft is submitted to IETF in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Shankar Raman et.al.          Expires August 2013               [Page 1]

INTERNET DRAFT   Constructing Power Optimal P2MP-LSPs  February 18, 2013

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at http://www.ietf.org/1id-abstracts.html

The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html

Copyright and License Notice

Copyright (c) 2013 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.

Table of Contents

   1  Introduction
     1.1  Terminology
   2. Methodology of the proposal
     2.1  Discussion of this scheme
     2.2  Power to available multicast replication capacity ratio in a TLV
   3  Conclusion
   3  Security Considerations
   4  IANA Considerations
   5  References
     5.1  Normative References
     5.2  Informative References
   Authors' Addresses

1 Introduction

Multicast traffic across multiple areas within a given AS may be carried using P2MP TE-LSPs. The traffic may be carried from an ingress Provider Edge (PE) router to several egress PEs, for example in a multicast Virtual Private Network (MVPN). The autonomous system (AS) may comprise multiple areas: a backbone area and several non-backbone areas connected to each other through the backbone. If several such multicast streams are to be carried in the AS, it is useful to construct the P2MP TE-LSPs so that the line cards of the routers they traverse from source to destinations have optimal power to available replication capacity ratios. The intent is to lay out several such P2MP TE-LSPs so that the routers replicating multicast traffic along them make the best use of the power supplied to them, provided that sufficient replication capacity is available. We believe this would lead to an equilibrium of power to available replication capacity ratios amongst all routers in the topology, which in turn would reduce the overall ratios for the AS.

Each router and its line cards deployed in the AS have an advertised capability for replication. Vendor data sheets for most multi-layer switches and routers state a replication capability for each type of line card deployable on the box. Replication consumes power and delivers multiple streams of data from a given input stream.
Today, P2MP (Point-to-Multipoint) Label Switched Paths are constructed without taking into account the power to available replication capacity ratios of such routers, thus overloading certain routers while under-utilizing others. Optimal use of these resources could reduce power consumption on these routers and multi-layer switches. The equilibrium could be reached by having each router advertise a Traffic Engineering Database Link State Advertisement (TED-LSA) that carries the power to available replication capacity ratio of each of its line cards, based on the current utilization of its replication capacity and its power consumption.

This document is organized as follows. In section 2, we describe the scheme that we propose. In section 2.1, we discuss some examples of the scheme at work, and in section 3 we conclude with areas of future study.

1.1 Terminology

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [RFC2119].

2. Methodology of the proposal

The key metric under consideration is the power consumed divided by the available replication capacity on each of the line cards of a router in the AS that is eligible to carry multicast traffic. Once this metric has been advertised through the regular flooding process of a link-state routing protocol such as OSPF-TE or ISIS-TE, the head-end router of a P2MP TE-LSP can compute the TE-LSP through the AS from the ingress PE to all egress PEs of that multicast stream in such a way that the power to available replication capacity ratios at the replication points are minimal on that path.
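
The metric defined above can be illustrated with a minimal sketch. The class name, field names, units and sample values below are assumptions made for illustration only; this document does not define them. A line card with no replication capacity left is given an infinite ratio so that it is never preferred.

```python
from dataclasses import dataclass
import math


@dataclass
class LineCard:
    """Hypothetical per-line-card record; all fields are illustrative."""
    name: str
    power_watts: float     # current power consumption
    base_capacity: float   # total replication capacity
    used_capacity: float   # replication capacity already in use

    def ratio(self) -> float:
        """Power consumed divided by available replication capacity."""
        available = self.base_capacity - self.used_capacity
        if available <= 0:
            return math.inf  # nothing left to replicate with
        return self.power_watts / available


# Two hypothetical line cards on the same router.
cards = [
    LineCard("R1/lc0", power_watts=200.0, base_capacity=10.0, used_capacity=4.0),
    LineCard("R1/lc1", power_watts=180.0, base_capacity=10.0, used_capacity=0.0),
]

# The card with the smallest ratio is the preferred replication point.
best = min(cards, key=LineCard.ratio)
print(best.name, best.ratio())
```

In this sketch a lower ratio is better: a card is preferred when it draws little power relative to the replication capacity it can still offer.
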
The Constrained Shortest Path First (CSPF) algorithm could be modified to compute the path with the least power to available replication capacity ratio cost, thus shifting the network towards the equilibrium described earlier. This path would be supplied to the RSVP-TE component of the head-end, which would set up the path with the appropriate labels. Once RSVP-TE establishes the path and traffic is carried across it, the reduced replication capacity of the routers on the P2MP TE-LSP would be re-advertised, and the updated values would be used to compute other paths from that point on.

Assume the following router topology in the vicinity of the sender or senders.

           +----------------+
          /                 V
         /          +----> (R2) ------> (R3)<--(RcvrB)
        /          /                       \         |
       /          /                         \ +------+
      /          /                           \V
(source/s)--->(R1)------> (R5)       ------> (R4)<-(RcvrA)
                 \                           /\           /
                  \               ----------+  \         /
                   \             /              \       /
                    (R6)----> (R7)     --------> (R8)<-+

Figure 1: Topology within a given AS with coloring for Power-replication ratios

In the figure above, the source or sources are multi-homed to the same ISP through routers R1 and R2. Similarly, the two receiver sites RcvrA and RcvrB are each multi-homed to two routers: RcvrB to R3 and R4, and RcvrA to R4 and R8.

           +----------------+
          /                 V
         /          +----> (R2) ------> (R3)<--(RcvrB)
        /          /                       \         |
       /          /                         \ +------+
      /          /                           \V
(source/s)...>(R1)------> (R5)       ------> (R4)<-(RcvrA)
                 .                           .\           /
                  .               ...........  \         /
                   .             .              \       /
                    (R6)....> (R7)     ........> (R8)<-+

Legend: dotted lines represent the computed path.

Figure 2: Instantiating an optimal power consuming distribution tree

The path computation engine at the head-end R1 is given this topology and, through the TED-LSAs carried by the IGP-TE extensions, the current power to available replication capacity ratios. It computes the paths with the least power to available replication capacity ratios, and these paths are set up from the head-end PE to the tail-end PEs where the receivers are connected. The ratios are examined across the topology, and the replication points are set up on the routers that have the least power to available replication capacity ratio. Where no branching point is required, the path is still routed through the routers with the least ratio, since these are the next best location for a non-branching hop.

Assume the following path is computed from the least power to available replication capacity ratios. The tree is computed through R6, R7, R8 and R4. Say the multicast stream occupies 4GB of traffic along this tree, so that the available capacity of these routers reduces to 6GB, assuming all of them have a base capacity of 10GB. Subsequent path computations would have to take into account the newly computed power to current replication capacity ratios in the topology when constructing new P2MP TE-LSPs for multicast streams yet to come.

Assume another 6GB of traffic is loaded onto this topology as one or more multicast streams. The new path computed for these new streams would possibly use the same path as computed before.
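
The recomputation described above can be sketched as follows. This is an illustration under stated assumptions, not the CSPF modification itself: it uses plain Dijkstra for a single ingress-to-egress path, charges the ratio of a router on every link that enters it, and holds power constant while capacity is consumed. The function names, the link set and the wattages are hypothetical and are not a transcription of Figure 1. A P2MP tree would combine such paths towards every egress PE.

```python
import heapq
import math


def ratios(power_w, available_gb):
    """Per-router metric: power divided by available replication capacity."""
    return {
        r: (power_w[r] / available_gb[r] if available_gb[r] > 0 else math.inf)
        for r in power_w
    }


def least_ratio_path(links, weight, src, dst):
    """Dijkstra in which entering router v over any link costs weight[v]."""
    dist = {src: 0.0}
    prev = {}
    heap = [(0.0, src)]
    done = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)
        if u == dst:
            break
        for v in links.get(u, ()):
            nd = d + weight[v]
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if dst not in dist:
        return None  # no usable path
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    return path[::-1]


# Hypothetical directed links and power figures (watts).
links = {"R1": ["R5", "R6"], "R5": ["R4"], "R6": ["R7"], "R7": ["R8"], "R8": ["R4"]}
power_w = {"R4": 150.0, "R5": 400.0, "R6": 100.0, "R7": 100.0, "R8": 100.0}
available_gb = {r: 10.0 for r in power_w}  # base capacity of 10GB each

first = least_ratio_path(links, ratios(power_w, available_gb), "R1", "R4")

# A 4GB stream placed on that path leaves 6GB on every router it enters.
for router in first[1:]:
    available_gb[router] -= 4.0

second = least_ratio_path(links, ratios(power_w, available_gb), "R1", "R4")
print(first)
print(second)
```

With these particular numbers the first stream is routed through R6, R7 and R8, and the raised ratios on those routers then steer the next computation through R5. With other power figures the second computation could equally well return the first path again.
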
If the old streams reduce the replication capacity to such an extent that the routers through which they pass can no longer be used, because their power to available replication capacity ratios have become poor compared to other paths, then a different path may be computed from the ingress PE to the egress PEs so as to avoid the routers with poor ratios.

For example, assume R6, R7, R8 and R4 have exhausted their capacity, or consume more power as a result of carrying the 4GB stream that was originally placed on them. A different path would then be chosen as follows. The path, as shown in Figure 3, is R2, R3 and R4. R4 is the only choice since it has connectivity to both receivers; in this case the branch point is placed on R3, with one branch to RcvrB and the other to RcvrA through R4. Policy decisions could guide the placement in case of a tie. Here the only choice has been to drive the end replication to RcvrA through R4 and to RcvrB through R3, owing to topology constraints.

The power consumed by the line card is divided by its available replication capacity to arrive at a ratio, and that ratio is assigned as a weight to all of the links ingressing on that line card. Alternatively, one might take a weighted average by dividing a weighted coefficient sum by the weighted sum of ingress links on a line card, and use the resulting value as the metric for the calculation.

           ..................
          .                 V
         .          +----> (R2)........> (R3)<--(RcvrB)
        .          /                        .         |
       .          /                          . +------+
      .          /                            .V
(source/s)--->(R1)-------> (R5)      -------> (R4)<-(RcvrA)
                 \                            /\          /
                  \                ----------+  \        /
                   \              /              \      /
                    (R6)-----> (R7)      -------> (R8)<+

Legend: dotted lines represent the computed path.

Figure 3: Instantiating a subsequent optimal power consuming distribution tree

2.1 Discussion of this scheme

Our scheme applies to centralized schemes of path calculation.
What is being calculated is a tree of nodes that form a P2MP tree, where each node can be conceptualized as a router (or multi-layer switch) and each edge as the link connecting one or more ports on a line card to a line card on a downstream router. The tree carries multicast traffic from a source located at the head-end ingress router to several receiver nodes connected to egress routers. We will call this calculated tree a P2MP tree. The tree is calculated by the PCE in the head-end (ingress) router through which the sources connect. The PCE calculates the intra-AS P2MP path (the literal P2MP TE-LSP) within that AS.

The calculated power to available replication capacity ratio is assigned to each of the ingress links on a line card of a router en route to the egress links through which the multicast stream is replicated on the same router. Thus all ingress links to a router through a line card are assigned the same metric, namely the power ratio so calculated. The egress links in turn connect to a unicast tunnel or to another branch point in the tunnel towards the receivers, which are represented by the egress routers. The egress routers are in turn replication points or direct connections to the actual receivers. This method could be applied to multicast traffic transported through MVPNs. Discovery of the egress routers is left to existing mechanisms. The primary input to the proposed scheme is an ingress router and its respective egress routers. The other input to the construction of the P2MP tree is the router-level topology with the power to available replication capacity ratio metrics.

The CSPF calculation can be sped up by dividing the weights into equivalence classes.
First, we color the nodes of the graph, marking the nodes with the least ratios green as shown in the figure. If there exists a path that is all green from the source to the egress PEs, one such path is chosen. If the green nodes alone leave the graph disconnected, we incrementally add the nodes of the next best color to the graph and check again for a connected path from the source to the egresses. These steps are repeated until a connected path is found. This brings the algorithm to a conclusion faster than a brute force method, which may take an inordinate amount of time. The use of R4 in the 6GB case is an example: because of topology restrictions, R4 had to be chosen in spite of the fact that it is no longer green after carrying the 4GB stream.

Routers may have step levels at which their power consumption increases as they are loaded with additional high-bandwidth multicast streams. Calibrating these levels may be useful for implementing this scheme. Such calibrated thresholds could be used when advertising the power to available replication capacity ratios in the IGP-TE advertisements. This would reduce the frequency of updates or advertisements from a line card about its ratios. While power consumption stays within a given interval, the ratios need not be re-advertised even if further multicast streams are added to the line card. The incentive is to recognize a line card whose power consumption does not change drastically even when large-bandwidth streams are added to it for replication, and thus give it credit for its power-optimal functioning.
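
The step-level idea above can be sketched as follows, assuming a line card's calibrated power thresholds are known. The class name, method names and threshold values are hypothetical; the sketch only shows the suppression of re-advertisements while power stays within one interval, not the content of the advertisement itself.

```python
import bisect


class RatioAdvertiser:
    """Hypothetical helper: re-advertise only when the power step changes."""

    def __init__(self, power_steps_watts):
        # Calibrated thresholds (assumed values) that separate power intervals.
        self.power_steps = sorted(power_steps_watts)
        self.last_level = None  # nothing advertised yet

    def level(self, power_watts):
        """Index of the calibrated interval that the reading falls into."""
        return bisect.bisect_right(self.power_steps, power_watts)

    def should_readvertise(self, power_watts):
        """True only when the reading has moved to a different interval."""
        current = self.level(power_watts)
        changed = current != self.last_level
        self.last_level = current
        return changed


adv = RatioAdvertiser([100.0, 200.0, 300.0])
readings = [120.0, 150.0, 210.0, 205.0, 90.0]
decisions = [adv.should_readvertise(p) for p in readings]
print(decisions)
```

The first reading always triggers an advertisement. Later readings trigger one only when they cross a calibrated threshold, so a line card whose power stays in one interval while streams are added generates no further updates.
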

If a router tends to consume the highest level of power even when carrying and replicating a small number of multicast streams on its line card, it would automatically have a poor ratio compared to a router that uses power efficiently for the replication capacity in use. The best case would be a low-power line card, or a router filled with such line cards, that does not leave its power interval no matter how much replication capacity is used on it. This is an ideal scenario, but one towards which router manufacturers should work.

Several multicast streams may be aggregated onto a single P2MP TE-LSP representing a multicast tree that encompasses the union of all the egress PEs of those streams. The ingress PE router is, however, common to all the multicast streams so covered. Aggregating several multicast streams from a given ingress PE to several egress PEs is commonly done to reduce the amount of state in the core of the network. By aggregating these streams onto a single P2MP tree, it is possible to amortize the cost of replication over a particular set of ingress line cards (and ports on those line cards) while taking into account the power consumption and replication capacity available at the time the P2MP TE-LSP is computed. Egress PEs join and leave the multicast tree depending on whether there are multicast listeners in the VPN sites attached to them. This dynamic nature makes it important to position the replication points so as to maximize the overall improvement in the ratios computed for the AS. It is important to keep this in mind when aggregating multiple multicast streams over a single P2MP TE-LSP.
So the key point is to aggregate multiple streams with a set-theoretic approach in mind, so that the egress PEs of these streams overlap as much as possible, and to place these streams on a P2MP TE-LSP in such a way that the ratios are optimal for that set of streams (with the overall AS power consumption in mind).

2.2 Power to available multicast replication capacity ratio in a TLV

As per [RFC3630], the Link TLV can be used to carry this power to available multicast replication capacity ratio in an additional sub-TLV of the Link TLV. Sub-type number 10 is recommended for this purpose.

[RFC3630] states in section 2.2.1, and we quote:

2.2.1 Link TLV

The Link TLV describes a single link. It is constructed of a set of sub-TLVs. There are no ordering requirements for the sub-TLVs.

Only one Link TLV shall be carried in each LSA, allowing for fine granularity changes in topology.

The Link TLV is type 2, and the length is variable.

The following sub-TLVs of the Link TLV are defined:

    1 - Link type (1 octet)
    2 - Link ID (4 octets)
    3 - Local interface IP address (4 octets)
    4 - Remote interface IP address (4 octets)
    5 - Traffic engineering metric (4 octets)
    6 - Maximum bandwidth (4 octets)
    7 - Maximum reservable bandwidth (4 octets)
    8 - Unreserved bandwidth (32 octets)
    9 - Administrative group (4 octets)
   10 - Power-to-Multicast-replication-capacity (4 octets)

This memo defines sub-Types 1 through 9. See the IANA Considerations section in [RFC3630] for allocation of new sub-Types.

The Link Type and Link ID sub-TLVs are mandatory, i.e., must appear exactly once. All other sub-TLVs defined here may occur at most once. These restrictions need not apply to future sub-TLVs. Unrecognized sub-TLVs are ignored.

Various values below use the (32 bit) IEEE Floating Point format.
For quick reference, this format is as follows:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |S|    Exponent   |                  Fraction                   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

S is the sign, Exponent is the exponent base 2 in "excess 127" notation, and Fraction is the mantissa - 1, with an implied binary point in front of it. Thus, the above represents the value:

   (-1)**(S) * 2**(Exponent-127) * (1 + Fraction)

It is proposed that the Power-to-multicast-replication-capacity ratio be carried as a 32-bit IEEE floating point field for the purpose of this document.

3 Conclusion

Here we propose a scheme that uses the power to available replication capacity ratios as weights for the edges and computes a low-cost power path for multicast replication. This scheme could be extended to inter-AS multicast streams, where the multicast stream is carried over multiple ASes. This is an area of future study that could help bring about optimal power usage and thus give vendors an incentive to manufacture low-power equipment. Since the proposed scheme gives priority to line cards with low power consumption, vendors of networking equipment would be compelled to rethink power consumption and drive it down.

3 Security Considerations

The security considerations for this proposal are the same as for any new opaque LSA introduced in an IGP such as OSPF or IS-IS.

4 IANA Considerations

IANA would need to assign a new opaque LSA type to carry power and multicast replication capacity so that this information can be carried in the TE-LSAs within an AS.

5 References

5.1 Normative References

[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.

[RFC3630] Katz, D., Kompella, K., and D. Yeung, "Traffic Engineering (TE) Extensions to OSPF Version 2", RFC 3630, September 2003.

5.2 Informative References

[1] G. Appenzeller, "Sizing router buffers", Doctoral Thesis, Department of Electrical Engineering, Stanford University, 2005.

[2] A. P. Bianzino, C. Chaudet, D. Rossi and J. L. Rougier, "A survey of green networking research", IEEE Communications Surveys and Tutorials, preprint.

[3] J. Baliga, K. Hinton and R. S. Tucker, "Energy consumption of the internet", Proc. of joint international conference on optical internet, June 2007, pp. 1993.

[4] J. Chabarek, J. Sommers, P. Barford, C. Estan, D. Tsiang and S. Wright, "Power awareness in network design and routing", Proc. of the IEEE INFOCOM 2008, April 2008, pp. 457-465.

[5] M. Xia et al., "Greening the optical backbone network: A traffic engineering approach", IEEE ICC Proceedings, May 2010, pp. 1995.

[6] W. Lu and S. Sahni, "Low-power TCAMs for very large forwarding tables", IEEE/ACM Transactions on Networking, June 2010, vol. 18, no. 3, pp. 948-959.

[7] B. Zhang, "Routing Area Open Meeting", Proceedings of the IETF 81, Quebec, Canada, July 2011.

Authors' Addresses

   Shankar Raman
   Department of Computer Science and Engineering
   IIT Madras
   Chennai - 600036
   TamilNadu
   India

   EMail: mjsraman@cse.iitm.ac.in

   Balaji Venkat Venkataswami
   Department of Electrical Engineering
   IIT Madras
   Chennai - 600036
   TamilNadu
   India

   EMail: balajivenkat299@gmail.com

   Prof. Gaurav Raina
   Department of Electrical Engineering
   IIT Madras
   Chennai - 600036
   TamilNadu
   India

   EMail: gaurav@ee.iitm.ac.in