RTGWG Working Group Shankar Raman Internet-Draft Balaji Venkat Venkataswami Intended Status: Experimental RFC Gaurav Raina Expires: November 5, 2013 IIT Madras May 4, 2013 Building power shortest inter-Area TE LSPs using pre-computed paths draft-mjsraman-rtgwg-intra-as-psp-te-leak-03 Abstract In this paper, we propose a framework to reduce the aggregate power consumption of an Autonomous System (AS) using a collaborative approach between areas within an AS. We identify the low-power paths within non-backbone areas and then use Traffic Engineering (TE) techniques to route the packets along the stitched paths from non- backbone areas / backbone area to other non-backbone areas. Such low- power paths can be identified by using the power-to-available- bandwidth (PWR) ratio as an additional constraint in the Constrained Shortest Path First (CSPF) algorithm. For routing the data traffic through these low-power paths, the Inter-Area Traffic Engineered Label Switched Path (TE-LSP) that spans multiple areas can be used. Extensions to the Interior Gateway Protocols like OSPF and IS-IS that support TE extensions can be used to disseminate information about low-power paths in the respective areas (backbone or non-backbone) that minimize the PWR ratio metric on the links within the areas and between the areas thereby creating a collaborative approach to reduce the power consumption. The feasibility of our approaches is illustrated by applying our algorithm to an AS with a backbone area and several non-backbone areas. The techniques proposed in this paper for the Inter-Area power reduced paths require a few modifications to the existing features of the IGPs supporting TE extensions. The proposed techniques can be extended to other levels of Internet hierarchy, such as Inter-AS paths, through suitable modifications as in [11]. When link state routing protocols like OSPF or ISIS are used to discover TE topology, there is the limitation that traffic engineered paths can be set up only when the head and tail end of the label switched path are in the same area. There are solutions to overcome this limitation either using offline Path Computation Engine (PCE) that attach to multiple areas and know the topology of all areas. This document proposes an alternative approach that does not require any centralized PCE and uses selective leaking of low-power TE path information from one area into other areas. Shankar Raman et.al Expires November 5, 2013 [Page 1] INTERNET DRAFT Building power shortest inter-Area TE-LSPs May 4, 2013 Status of this Memo This Internet-Draft is submitted to IETF in full conformance with the provisions of BCP 78 and BCP 79. Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts. Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress." The list of current Internet-Drafts can be accessed at http://www.ietf.org/1id-abstracts.html The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html Copyright and License Notice Copyright (c) 2013 IETF Trust and the persons identified as the document authors. All rights reserved. This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License. Table of Contents 1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 4 1.1 Terminology . . . . . . . . . . . . . . . . . . . . . . . . 4 1.1 Low-power routers and switches . . . . . . . . . . . . . . . 4 1.2 Power reduction using routing and traffic engineering . . . 4 2. Methodology of the proposal . . . . . . . . . . . . . . . . . 6 2.1 ABR Operation . . . . . . . . . . . . . . . . . . . . . . . 6 2.1.1 Methodology . . . . . . . . . . . . . . . . . . . . . . 7 Shankar Raman et.al Expires November 5, 2013 [Page 2] INTERNET DRAFT Building power shortest inter-Area TE-LSPs May 4, 2013 2.1.2 ERRATA . . . . . . . . . . . . . . . . . . . . . . . . . 11 2.1.3 Power Bias . . . . . . . . . . . . . . . . . . . . . . . 11 2.1.4 Advertising Available POWER . . . . . . . . . . . . . . 12 2.1.5 ECMP links . . . . . . . . . . . . . . . . . . . . . . . 12 2.1.6 Dampening the side effects of constant change . . . . . 12 2.1.7 Calculating power shortest paths in an Area . . . . . . 12 2.1.8 Power profiles of Routers and Switches . . . . . . . . . 13 2.1.8.1 Concave and Convex power curves . . . . . . . . . . 15 2.1.8.2 Need to advertise both available power and consumed power . . . . . . . . . . . . . . . . . . . 17 2.1.9 Power to Available Bandwidth ratio in a TLV . . . . . . 17 2.2 TE Path Head-end Operation . . . . . . . . . . . . . . . . . 20 2.2 Suppression of Frequent updates owing to fluctuation in power and bandwidth . . . . . . . . . . . . . . . . . . . . 22 2.3 Advantages . . . . . . . . . . . . . . . . . . . . . . . . . 23 3 Security Considerations . . . . . . . . . . . . . . . . . . . . 24 4 IANA Considerations . . . . . . . . . . . . . . . . . . . . . . 24 5 References . . . . . . . . . . . . . . . . . . . . . . . . . . 24 5.1 Normative References . . . . . . . . . . . . . . . . . . . 24 5.2 Informative References . . . . . . . . . . . . . . . . . . 24 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 25 Shankar Raman et.al Expires November 5, 2013 [Page 3] INTERNET DRAFT Building power shortest inter-Area TE-LSPs May 4, 2013 1 Introduction Estimates of power consumption for the Internet predict a 300% increase, as access speeds increase from 10 Mbps to 100 Mbps [3], [8]. Access speeds are likely to increase as new video, voice and gaming devices get added to the Internet. Various approaches have been proposed to reduce the power consumption of the Internet such as designing low-power routers and switches, and optimizing the network topology using traffic engineering methods [2]. 1.1 Terminology The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [RFC2119]. 1.1 Low-power routers and switches Low-power router and switch design aim at reducing the power consumed by hardware architectural components such as transmission link, lookup tables and memory. In [4] it is shown that the router's link power consumption can vary by 20 Watts between idle and traffic scenarios. Hence the authors suggest having more line cards and running them to capacity: operating the router at full throughput will lead to less power per bit, and hence larger packet lengths will consume lower power. The two important components in routers that have received attention for high power consumption are buffers and TCAMs. Buffers are built using dynamic RAM (DRAM) or static RAM (SRAM). SRAMs are limited in size and consume more power, but have low access times. Guido [1] states that a 40Gb/s line card would require more than 300 SRAM chips and consume 2:5kW. DRAM access times prevent them from being used on high speed line cards. Sometimes the buffering of packets in DRAM is done at the back end, while SRAM is used at the front end for fast data access. But these schemes cannot scale with increasing line speeds. Some variants of TCAMs have been proposed for increasing line speeds and for reduced power consumption [7]. 1.2 Power reduction using routing and traffic engineering At the Internet level, creating a topology that allows route adaptation, capacity scaling and power-aware service rate tuning, will reduce power consumption. In [8] the author has proposed a technique to traffic engineer the data packets in such a way that the link capacity between routers is optimized. Links which are not utilized are moved to the idle state. Power consumption can be reduced by trading off performance related measures like latency. For Shankar Raman et.al Expires November 5, 2013 [Page 4] INTERNET DRAFT Building power shortest inter-Area TE-LSPs May 4, 2013 example, power savings while switching from 1 Gbps to 100 Mbps is approximately 4 W and from 100 Mbps to 10 Mbps around 0:1 Watts. Hence instead of operating at 1 Gbps the link speed could be reduced to a lower bandwidth under certain conditions for reduced power consumption. Multi layer traffic engineering based methods make use of parameters such as resource usage, bandwidth, throughput and QoS measures, for power reduction. In [6] an approach for reducing Intra-AS power consumption for optical networks that uses Djikstra's shortest path algorithm is proposed. The input to this method assumes the existence of a network topology using which an auxiliary graph is constructed. Power optimization is done on the auxiliary graph and traffic is routed through the low-power links. However, the algorithm expects the topology to be available for getting the auxiliary graph. This topology is easy to obtain for Intra-AS scenario, but by using a centralized PCE (Path Computation Element) as in a hierarchical PCE approach. Here for each area a PCE is assigned and each such PCE calculates the path from a head-end router to a tail-end router, both falling within the same area. When TE paths have to be stitched across several areas then the hierarchical PCE which may be one level up from the respective area PCEs is contacted for such a stitching. In our approach, we propose a collaborative approach by the respective areas in calculating low-power paths that result in power reduction within an AS. This document proposes an alternative approach that does not require any centralized PCE and uses selective leaking of low-power TE path information from one area into other areas. The core of most ISP ASes use the Multi-Protocol Label Switching (MPLS) technology. MPLS label switched paths that traverse multiple areas carry traffic from a head-end to a tail-end that can be situated in different areas within the AS. The AS uses the Interior Gateway Protocol (IGP) for exchanging routing related information. The topology of one area is not revealed to the other in OSPF-TE and IS-IS-TE. The CSPF algorithm as proposed here is run on a specific area with the available power-to-bandwidth (PWR) ratio as a constraint, to determine "k" (where k is a suitable number) low-power-paths from the head-end to the tail-end within the same area. The low-cost power paths that minimize the PWR ratio can be exchanged among the collaborating areas using IGP-TE TLVs that we propose in this document. Explicit routing using RSVP-TE (for signalling) then can be achieved between the head-end and the tail-end routers traversing multiple areas through these low-power paths connecting the head-end and tail-end using the Inter-Area Traffic Engineered Label Switched Path (TE-LSP) that span multiple areas. Shankar Raman et.al Expires November 5, 2013 [Page 5] INTERNET DRAFT Building power shortest inter-Area TE-LSPs May 4, 2013 2. Methodology of the proposal There are three known solutions to inter-area TE (a) hop expansion at area boundaries where the head end can only choose the path to area boundary rather than right to tail end, (b) centralized PCE is attached to all areas and is aware of entire topology, and (c) path stitching by designating ABRs acting as BGP route reflectors. It is of course possible to build out low-power paths through the above techniques but they suffer limitations such as not knowing for certain whether the path exists a-priori. This document proposes a technique where a-priori low-power paths are pre-computed in the various areas and are leaked into other areas so that provisioning these paths is done much more quicker than is otherwise possible. Assume {N} as the set of nodes in a network running link state routing protocol and {N' } be the set of nodes that are known to be the endpoints of the traffic engineered paths. The topology {N, E} has been divided into hierarchical areas with backbone area as the second level that connects first level of all non-backbone areas. We assume the network runs either OSPF-TE or ISIS-TE for establishing TE paths. The set of nodes {N'} can be situated in any non-backbone area or the backbone area. Nodes in {N'} may become aware of being potential endpoints through offline configuration. Once the nodes in {N'} become aware of being TE endpoints, they advertise themselves in a special TLV in TE link state information. We would term this "TE Endpoint TLV". In OSPF, they would advertise a newly defined TLV in TE LSA and in ISIS, they would advertise a newly defined TLV in TE LSP. Apart from nodes in {N'} the area border routers or ABRs advertise another newly defined TLV that we would term as "Area Border TLV". 2.1 ABR Operation Apart from standard OSPF/ISIS ABR functions, each ABR should discover the TE endpoints in every area attached to it. Assume for an ABR, let the set discovered be {Ai, Nj}. The ABR should compute k-power- shortest-paths to every element in {Ai, Nj} based on the constraints applicable to the network. The constraint applied here is the minimization of the PWR ratio which is defined as follows. Shankar Raman et.al Expires November 5, 2013 [Page 6] INTERNET DRAFT Building power shortest inter-Area TE-LSPs May 4, 2013 For a given router that is an ABR for an area (straddling the backbone and non-backbone), a set of k-shortest paths that can be potentially be used as a link towards a TE endpoint are identified. 2.1.1 Methodology For each router / switch there exist linecards and each linecard has a set of ports or sometimes just one port of high capacity. This usually applies on routers and switches that are either single chassis or multi-chassis in their characterisation. By single chassis we mean that there exists a single chassis and slots for the Route Processor Card (one or more of these) typically upto to two of them, and one or more slots for linecards each having their respective characteristics such as number of ports (port density), type of such ports (SONET, ethernet, ATM etc..) usually depending on the link layer technology they support. Links are connections between ports on these linecards to other ports on linecards of other single chassis or multi-chassis system. A multi-chassis system is one that has multiple such chassis interconnceted amongst each other to form a single logical view of the system. Both single and multi-chassis have linecards and respective ports on these linecards. Multi-chassis typically have a switch fabric chassis which connects each of these chassis to each other or to chassis of other multi-chassis or single chassis systems. Shankar Raman et.al Expires November 5, 2013 [Page 7] INTERNET DRAFT Building power shortest inter-Area TE-LSPs May 4, 2013 Consider the following topology as one that falls within an area... Router A Router B Router C +---+---+ +---+---+ +-------+ | | | | | | | | | |LC1|LC2| |LC1|LC2| |LC1|LC2| | | | | | | L11 | | | | P1| P1| | P1| P1|-------------- P1| P1|---+ | P2| P2|--+ | P2| P2| L12 | P2| P2| | | P3| P3| | L4 | P3| P3|-------------- P3| P3| | | P4| P4|--+----------- P4| P4| +---- P4| P4| | | P5| P5| | +----P5| P5--+ L5 | | P5| P5| | | | | | | | | | | | | | | | | | | +-|-+-|-+ |L3 | +---+---+ | | +---+-|-+ | L13 | | | +------------+-------+ | | | |L2 | L5 | | | | +----+------------+ | | | | | | | | | |L1 | | |L6 | | | | Router D | | Router E L12| | Router F | | +---+---+ | | +---+---+ | |+-------+ | | | | | |L2 | | | | | || | |L | | |LC1|LC2| | | |LC1|LC2| | ||LC1|LC2|1 | | | | | | | | | | | || | |4.. | +-| P1| P1---+ | | P1| P1|------+ || P1| P1|-> | | P2| P2| L7 +--- P2| P2| +--P2| P2|-> | | P3| P3|-------------- P3| P3| L10 | P3| P3|-> +----------| P4| P4| +---- P4| P4|-------------- P4| P4| | P5| P5| | +-- P5| P5| +----- P5| P5| | | | | | | | | | | | | | +-|-+---+ L8 | | +---+---+ L9 | +---+---+ +---------------+ +------------------+ Shankar Raman et.al Expires November 5, 2013 [Page 8] INTERNET DRAFT Building power shortest inter-Area TE-LSPs May 4, 2013 The table of links between the various routers (which are assumed to be single chassis systems) is as follows... +--------+----------+-----------+-----------+-----------+----------+ | Links | Routers | LC <> LC | Port Conn.| Capacity |Available | | | | | | |Bandwidth | +--------+----------+-----------+-----------+-----------+----------+ | L1 | A <> D | LC1<>LC1 | P5<>P4 | 10G | 7.5 | | L2 | A <> D | LC2<>LC2 | P5<>P1 | 10G | 6.0 | | L3 | A <> D | LC2<>LC1 | P2<>P1 | 10G | 4.0 | | L4 | A <> B | LC2<>LC1 | P4<>P4 | 10G | 3.0 | | L5 | B <> C | LC1<>LC1 | P5<>P4 | 10G | 3.5 | | L6 | B <> E | LC1<>LC1 | P6<>P2 | 10G | 1.0 | | L7 | D <> E | LC2<>LC1 | P3<>P3 | 10G | 6.0 | | L8 | D <> E | LC1<>LC1 | P5<>P4 | 10G | 1.5 | | L9 | E <> F | LC1<>LC2 | P5<>P5 | 100G | 20.0 | | L10 | E <> F | LC2<>LC1 | P4<>P4 | 10G | 2.5 | | L11 | B <> C | LC2<>LC1 | P1<>P1 | 10G | 3.0 | | L12 | E <> C | LC2<>LC2 | P1<>P5 | 10G | 2.0 | | L13 | C <> F | LC2<>LC1 | P1<>P2 | 10G | 1.0 | | L14 | F <> OA | LC2<> | P1<> | | | | | | | | | | +--------+----------+-----------+-----------+------------+---------+ In the above topology assume all point-to-point links between the routers. For now we will deal with P2P links alone and not venture into Broadcast Multi-access links or Non-Broadcast Multi-access links etc.. It is suffice to show how the scheme works for P2P links and then move more specifically to other types of networks to demonstrate this method of calculating the power topology of the network in the figure. Each linecard consumes a certain amount of power and it is vendor dependent as to how the power consumed relates to the Available Bandwidth on any of the links to which the linecard connects to. It is possible that the said topology of routers come from one vendor or from multiple vendors. It is assumed that the algorithm proposed will have the power consumed by a linecard available as a readable value in terms of W or kW or whichever measurable metric that is provided by the vendor. It is possible that some of the Linecards are more capable than the others. Consider that Router A is a more capable router with more powerful linecards with higher port density. This is not shown in the figure, but assume so. LC1, LC2 on Router A could be consuming more power than the other Linecards on other routers. The main reason could be that LC1 and LC2 may have higher port density or higher Shankar Raman et.al Expires November 5, 2013 [Page 9] INTERNET DRAFT Building power shortest inter-Area TE-LSPs May 4, 2013 speed ports than the other routers. In order to calculate the power consumed on a link by a linecard it is important that we normalize the power as power consumed per port. Here the ports are normalized to lowest common denominator. If all links in the topology have 10G port capacity then the power calculated should be in terms power consumed per 10G port. Assuming we have done this normalization we go on to calculate the POWER metric for each of the ports involved in a link which is derived as follows... POWER metric = Power consumed per XG (normalized bandwidth) port for a given ------------------------------------------------- Port on a LC Available Bandwidth on that port Assume link L1. The ports concerned are both 10G and the ports are P5 on Router A and P4 on Router D. For calculating the POWER metric for a link which we will call PWRLINK we calculate the POWER metric for each side of the link and average the two to get PWRLINK. So PWRLINK for L1 = POWER for P5 on LC1 + Power for P4 on LC1 on Router A on Router D ============================================ 2 The above can also be weighted if there is a multi-capacity port on one side of the link and not on the other. A multi-capacity link is one which provides multiple bandwidth capabilities such (1G/10G/100G) for example but auto-negotiates with other end to provide a lesser than highest capacity service. The PWRLINK metrices once calculated are flooded in already defined OSPF-TE-LSA as an adapted TE-metric and is typically flooded as a link characteristic. It is important to note that the denominator for POWER metric is Available Bandwidth instead of Available Bandwidth on that port. The Available Bandwidth is measured in terms of intervals and not as discrete quantities. This is in order not to flood PWRLINK metrics into the OSPF area in LSAs very frequently as Bandwidth may constantly change. The same applies to POWER metric as well. Once the LSAs have been flooded the Routers run CSPF on the graph of the topology with PWRLINKs assigned to the links and calculate the PWRLINK based paths which consume the least power. The shortest power paths based on this topology can be used for forwarding high bandwidth streams and to optimally use power within the area. Shankar Raman et.al Expires November 5, 2013 [Page 10] INTERNET DRAFT Building power shortest inter-Area TE-LSPs May 4, 2013 The Available Bandwidth column shows the Available bandwidth of the link corresponding to the row and column intersection. This figure is used as the numerator in the POWER metric computation for that port. 2.1.2 ERRATA ERRATA : Previously the experiments were carried out with Available Utilization since only 10G and 100G ports were considered. This baselines the metric to 10G ports and proportionality thereof. But in reality the actual Available Bandwidth needs to be considered for real world experiments. Hence this draft has been changed to reflect the Available Bandwidth to be taken as the denominator of the formula thereof. In our previous experiments the 100G link if it showed a utilization of 0.2 would end up as a high POWER metric and hence would be totally avoided. In reality this link may have been a more power optimal link given that if it had a first power profile (Please refer section on Power Profiles). Dividing the Power consumed or Available Power by the Available Bandwidth gives a better picture of how much power cost per Gb is consumed and normalizes the metric amongst links of varying bandwidth. An earlier version of this document rev-00 contained a different algorithm to compute the k-shortest-power-paths. From the experimental results gathered it was seen that the said algorithm was prone to errors with respect to direction of traffic and unnecessarily complex for the solution. Hence it has been set aside for a more simple yet better one mentioned in this revision. 2.1.3 Power Bias Assume in the figure that there exist Routers A and D and that there is a bias on the link L1 in such a way that Router D computes a POWER metric of 10 and the Router D computes a POWER metric of 2 on the ports P5 and P4 respectively. Now the PWRLINK would be 6 for that link L1. Thus even if one side is excessively power guzzling then the PWRLINK moves up and thus is less preferred in the CSPF algorithm and path computation based on the Power topology. If there is no bias and both the sides of the link are optimal in their power usage then the metric stays low even if more streams are sent on it. This is the main objective that is set out for router and switch manufacturers in the single chassis and multi-chassis world, in that they are incentivised to manufacture linecards that are not power hungry even if the number of packets flowing through them is high and thus the Bandwidth Available is also reasonably on the higher side compared to other routers. Shankar Raman et.al Expires November 5, 2013 [Page 11] INTERNET DRAFT Building power shortest inter-Area TE-LSPs May 4, 2013 For those manufacturers who set a high power value for even minimal traffic, the vendors that dont would win out in the end. 2.1.4 Advertising Available POWER Please see section 2.1.8 for more information on why Available POWER plays a crucial role in determining the choice of routers based on the Power metric. 2.1.5 ECMP links It is possible that multiple links would have the same PWRLINK metric after a computation cycle. In such a case load-balancing techniques can be used to keep the ECMP links in a steady state with respect to each other. Depending on the Available Bandwidth thereafter it is possible that the ECMP links may no longer be Equal cost but UCMP or Unequal Cost Paths. 2.1.6 Dampening the side effects of constant change It is recommended in this draft that the implementation of the proposal be adaptive, infrequent in computation to the extent possible without sacrificing adapting to the dynamism and also reduce any frequent oscillations. The actual methods to adopt for this computation are outside the scope of this document. 2.1.7 Calculating power shortest paths in an Area Assume the following topology where A,B,C etc.. are routers and corresponding labelled edges with weights are the links. These weights are the current values of the PWRLINK attribute that has been flooded in the LSAs through the Area concerned. Assume B is the ABR for Area 1 and the routers A and C are the Area 0 core routers. The rest of the routers are assumed to be in Area 1. Once the power topology of the Area 1 has been calculated as shown below with the PWRLINK attributes being assigned to the links, Constrained shortest path can be run from the ABR to any of the other routers say H, E , X etc.. The CSPF algorithm takes the constraint in terms of the PWRLINK attributes along with other attributes to construct a power shortest path from say router B to other routers in Area 1. Shankar Raman et.al Expires November 5, 2013 [Page 12] INTERNET DRAFT Building power shortest inter-Area TE-LSPs May 4, 2013 0.5 (C) +----------------+ 0.5| / | | / | 0.05 V/ 0.1 0.03 0.2 V (A)--->(B)--->(D)--->(G)--->(H) | | | | 0.5| | 0.1 | V V +----------->(E)--->(X) 0.5 0.3 Once the path has been computed it is possible to use RSVP-TE to construct the power shortest path with the TE-LSP being instantiated with the labels appropriately placed in the routers on the power shortest path. In this topology, assume one would want to construct a path from B to X then the dotted path shows the path constructed and to be used by a set of flows or streams of packets belonging to multiple flows as seen fit by the router B. If the PWRLINK metrics change after due course of time then another power shortest path that possibly traverses the same path (if the SUM of PWRLINKs doesnt exceed any other path's metrics' SUM) or some other path would be constructed. Specifically this method makes use of traffic- engineering signalling protocols as the method to place the streams from point X to point Y (where X and Y are routers). 0.5 (C) +----------------+ 0.5| / | | / | 0.05 V/ 0.1 0.03 0.2 V (A)--->(B)...>(D)...>(G)...>(H) | | . | 0.5| . 0.1 | V V +----------->(E)--->(X) 0.5 0.3 2.1.8 Power profiles of Routers and Switches It has been experimented and from several sources found that there exist routers which have different power profiles. The power profile of a router is the curve of power consumption to available bandwidth. Mentioned below are a few of these prominent ones that have to be taken into consideration. The first profile that we will consider is the flattening curve. The power consumed to available bandwidth curve takes the shape of a Shankar Raman et.al Expires November 5, 2013 [Page 13] INTERNET DRAFT Building power shortest inter-Area TE-LSPs May 4, 2013 steep one initially and then tapers off to a plateau. The point at which it begins to give a delta-C (delta in Power Consumed) to delta- B (Available Bandwidth exhausted) is the inflection point that tapers off to a plateau. Here the delta-C/delta-B begins to slow down or decrease rapidly. The more the traffic that is added onto the device the lesser it draws power. ^ | P | . o | . w | . e | . r | . | . c | . o | . n | . s | . u.| . ------------------------------------> | Available Bandwidth exhausted The second profile that we will consider is the exponential curve. The power consumed to available bandwidth curve takes the shape of an ever increasing steep curve as shown below. Here the delta-C/delta-B begins to increase as more traffic is thrown onto it as the Available bandwidth exhausted increases. This power curve beyond a point is intolerable with respect to power guzzling. ^ | P | . o | . w | . e | . r | . | . c | . o | . n | . s | . u.| . ------------------------------------> | Available Bandwidth exhausted Shankar Raman et.al Expires November 5, 2013 [Page 14] INTERNET DRAFT Building power shortest inter-Area TE-LSPs May 4, 2013 The third profile that we will consider is a linear curve. In other words just a straight line. Here delta-C/delta-B is a constant. ^ | P | . o | . w | . e | . r | . | . c | . o | . n | . s | . u.| . ------------------------------------> | Available Bandwidth exhausted 2.1.8.1 Concave and Convex power curves Given that there are 3 kinds of major profiles in the router power consumption, what line would we like to pick. This is an important point when choosing the metric to pick the low power paths. (a) If the confrontation is between 2 first profile routers the lower of the 2 would be considered as shown below. The lower curve offers better power savings for each GB of bandwidth transported. ^ | P | . o | . w | . . e | . . r | . . | . . c | . . o | . . n | . . s | . . u.| . ------------------------------------> | Available Bandwidth exhausted Shankar Raman et.al Expires November 5, 2013 [Page 15] INTERNET DRAFT Building power shortest inter-Area TE-LSPs May 4, 2013 (b) If the confrontation is between 2 second profile routers the upper curve offers more power savings per GB of bandwidth. ^ | P | . . o | . . w | . . e | . . r | . . | . . c | . . o | . . n | . . s | . u.| . ------------------------------------> | Available Bandwidth exhausted (c) When the confrontation is between a first profile curve and a second profile curve, it would be optimal to pick (as shown below) the lower of the curves because it gives us lesser power consumed for every GB of traffic routed / switched. Here the exponential curve is the one that offers lesser amount of power consumed per GB of traffic is chosen. But when it gets to a point that the two curves intersect it would be more optimal to pick the tapering curve. Thus at the meeting point of the 2 curves the exponential curve becomes more costly and the tapering one gives us more GB for the power buck. Thus this switchover from one curve to the other (in other words from the exponential curve to the tapering one) does the trick in terms of finding an optimal solution. Shankar Raman et.al Expires November 5, 2013 [Page 16] INTERNET DRAFT Building power shortest inter-Area TE-LSPs May 4, 2013 ^ . | . P | . . o | (*) w | . . e | . . r | . . | . . c | . . o | . . n | . . s | . . u.| .. ------------------------------------> | Available Bandwidth exhausted (*) Metric switchover point from Consumed Power to Available Power. 2.1.8.2 Need to advertise both available power and consumed power Thus the above sections have shown that both the available power and the consumed power MUST be advertised so that case (c) can be deciphered and the switchover of the curves be done and the appropriate router be chosen for the rest of the bandwidth to be switched over to. Thus there will exist Consumed-Power to Available Bandwidth ratio and the Available Power to Available Bandwidth ratio. Both the ratios are computed and the lower value chosen. The Available Power can be judged from the calibration process such as the one carried out by independent test organizations as in [12]. An example of their calibration is referred to in [12]. Here given below is the formula for calculating the Available Power to Available Bandwidth ratio also called the Available POWER metric. Available POWER metric = Available Power consumed per XG (normalized bandwidth) port for a given ---------------------------------- Port on a LC Available Bandwidth on that port 2.1.9 Power to Available Bandwidth ratio in a TLV As per [RFC3630] the Link TLV can be used to carry this power to available Bandwidth ratio with an additional sub-TLV of the link TLV. The sub-type number 11 is recommended to be defined for this purpose. Shankar Raman et.al Expires November 5, 2013 [Page 17] INTERNET DRAFT Building power shortest inter-Area TE-LSPs May 4, 2013 [RFC 3630] states in section 2.2.1 and we QUOTE ... 2.1.10 Link TLV The Link TLV describes a single link. It is constructed of a set of sub-TLVs. There are no ordering requirements for the sub-TLVs. Only one Link TLV shall be carried in each LSA, allowing for fine granularity changes in topology. The Link TLV is type 2, and the length is variable. The following sub-TLVs of the Link TLV are defined: 1 - Link type (1 octet) 2 - Link ID (4 octets) 3 - Local interface IP address (4 octets) 4 - Remote interface IP address (4 octets) 5 - Traffic engineering metric (4 octets) 6 - Maximum bandwidth (4 octets) 7 - Maximum reservable bandwidth (4 octets) 8 - Unreserved bandwidth (32 octets) 9 - Administrative group (4 octets) 10 - Power-to-Multicast-replication-capacity (4 octets) 11 - Consumed-Power-to-Available-Bandwidth (4 octets) 12 - Available-Power-to-Available-Bandwidth (4 octets) This memo defines sub-Types 1 through 9. See the IANA Considerations in [RFC3630] section for allocation of new sub-Types. The Link Type and Link ID sub-TLVs are mandatory, i.e., must appear exactly once. All other sub-TLVs defined here may occur at most once. These restrictions need not apply to future sub-TLVs. Unrecognized sub-TLVs are ignored. Various values below use the (32 bit) IEEE Floating Point format. For quick reference, this format is as follows: 0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |S| Exponent | Fraction | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ S is the sign, Exponent is the exponent base 2 in "excess 127" notation, and Fraction is the mantissa - 1, with an implied binary point in front of it. Thus, the above represents the value: Shankar Raman et.al Expires November 5, 2013 [Page 18] INTERNET DRAFT Building power shortest inter-Area TE-LSPs May 4, 2013 (-1)**(S) * 2**(Exponent-127) * (1 + Fraction) It is proposed that we use the Power-to-Available-Bandwidth ratio as a 32 bit IEEE floating Point format field for the purpose of this document. Assume the following topology in a non-backbone area after calculating the PWR ratio in a given stage of the algorithm. 0.5 (C) +----------------+ 0.5| / | | / | 0.05 V/ 0.1 0.03 0.2 V (A)--->(B)--->(D)--->(G)--->(H) | | | | 0.5| | 0.1 | V V +----------->(E)--->(X) 0.5 0.3 Here (B) is a Area Border Router and has to ingress links into it from (C) and (A) which are in the backbone area. Connectivity within the backbone area are not shown here. Assume (C) and (A) are connected in some way with other routers in the backbone area. Routers (D), (G), (E), (H), (X) are routers in the non-backbone area. Routers (H), (E) and (X) are potential TE endpoints. The PWR metrics shown here on the edges within the area represent metrics for a specific TE endpoint. The metrics on edges (C)->(B) and (A)->(B) are for any traffic ingressing through (B) into the non-backbone area heading towards any TE endpoint (H), (E) or (X). The number of constraints is likely to be few and the most widely used constraints are TE metric, link groups and bandwidth. But no restriction is assumed on use of other constraints. Thus here we add the PWR metric of a link as an additional constraint. Once the ABR computes k-power-shortest-paths to every {Ai, Nj} it has topology information about, it advertises the k-power-shortest-paths as a reachability vector in a newly defined "TE Reachability Vector TLV". Consider an example network show below. TEh is head-end and TEt is tail-end of a TE path, ABR1 and ABR2 are area border routers. Shankar Raman et.al Expires November 5, 2013 [Page 19] INTERNET DRAFT Building power shortest inter-Area TE-LSPs May 4, 2013 TEh2---R2 R4-----TEt2 \ / \ / TEh1---R1----ABR1-----Rb1-----Rb2-----ABR2----R3----TEt1 Area 1 Backbone Area2 In this example, ABR1's TE Reachability vector TLV for area 1 and area 0 are given below. { ABR1, [>, >]} { ABR1, ABR2, [>, >]} Here the vector TLVs are arranged as per increasing PWR metric associated with each path. That is the summation of all PWR metrics of the links in the path is done and the vector TLVs are ordered in increasing order of PWR metric sums. So the lowest-cost-power path is listed first and so on. If the least cost power path is to be chosen then the path in the first TLV is chosen. Similarly ABR2's TE Reachability vector TLV for area 2 and area 0 are given below. { ABR2, [>, >]} { ABR2, ABR1, [>, >]} The first thing to be noted is that head-ends are also considered as TE-endpoints. Essentially this means any head-end or tail-end of a inter-area TE-LSP can be considered as tail-end or head-end respectively. Note that the reachability vector advertised by ABR1 also contains the reachability vector of ABR2. For example, if ABR2 is brought up first, then it is likely that ABR1 would only have the following as TE Reachability vector TLV for area 0 before ABR2 computes path to the TE endpoints in area 2. { ABR1, ABR2 } Note that TLV would only contain the aggregate of link attributes namely cost, bandwidth etc and most importantly the PWR metric as well but not the complete path of intermediate nodes. For example, may be a set of <2, admin-group-1|admin-group- 2, 1Gbps> (where the 1Gbps could be the minimum bw available along the path). The above example topology has only one path from ABRs to TE endpoints. The number of path info "k" may have a default value or can be configured by the operator on all nodes. 2.2 TE Path Head-end Operation When any TE application requests TE path to be setup to an endpoint Shankar Raman et.al Expires November 5, 2013 [Page 20] INTERNET DRAFT Building power shortest inter-Area TE-LSPs May 4, 2013 that is not present in the same area, the head-end scans the TE Reachability vector TLVs advertised by ABRs and selects the path using the contained in the vector TLVs. Here is an example with multiple paths in area 1, backbone and area 2 called Figure 2.0 TEh3----R5---ABR3----Rb3-----Rb4------ABR4----R6--TEt4 \ / ___/ \ ___/ \ / / \/ TEh2---R2---ABR5------Rb5--------ABR6---R4-----TEt2 / \ \____ /\___ / \ \ / \ TEh1---R1----ABR1-----Rb1-----Rb2-----ABR2----R3----TEt1 Area 1 Backbone Area2 In this topology in figure 2.0 taking the tail-ends represented in the diagram, it is noted that TEt4 is reachable via ABR4, ABR6 and ABR2 as well. The TE reachability TLVs advertised by ABR6 for area 2 would be multiple to each tail-end since there exist multiple paths to reach at least most of them in area 2 once a packet reaches any of the ABRs in area 2. Here again the least cost power shortest path is listed first and so on. { ABR6, [>, >, >, etc.. } For area 0 the TE reachability TLV would be { ABR6, ABR1, [>, >...]} { ABR6, ABR5, [>, >...]} { ABR6, ABR3, [>, >...]} For the sake of brevity we do not enumerate all path information possible as it would be quite extensive. It is possible that there may be already setup LSPs which are being used for transit traffic on the backbone or in other non-backbone areas. It is also feasible to advertize already set up LSPs in the path info; no additional TLV is required for that purpose. The case where this may be useful would be if such transport LSPs exist in the backbone area and there is a willingness to provide higher preference to these LSPs to carry transit LSPs over backbone. There can be selective suppression of advertisements to other areas Shankar Raman et.al Expires November 5, 2013 [Page 21] INTERNET DRAFT Building power shortest inter-Area TE-LSPs May 4, 2013 (backbone or non-backbone) of LSPs if these are existing LSPs setup along a path which are utilized to a greater degree. If underutilized with respect to the PWR metric a more favourable metric could be advertized to other areas. For example, backbone area transport LSPs will be advertized as transit LSPs which would provide connectivity to LSP sections lying in non-backbone areas and would be updated more frequently since they facilitate inter-Area TE. Once a path in the TLV has been used for reserving bandwidth for traffic over that path, then it is withdrawn from the advertisements so that it becomes unusable. Another path may be computed over the same path but with possibly a different PWR metric sum since it is possible that the traffic over that path could have changed the PWR metrices in the edges along that path. 2.2 Suppression of Frequent updates owing to fluctuation in power and bandwidth Using the power consumed and the bandwidth available as discrete quantities will result in frequent oscillations. Such a step would result will result in frequent re-computations of the shortest power paths. For the sake of suppression of such frequent updates, it is possible to handle the PWR metric as falling within reasonable intervals of thresholds. If the interval in which PWR metric lies is moved out of and another interval is reached then the update is sent out in the IGP-TE mechanism. Otherwise if the interval in which the PWR metric lies is not moved out of then the updates are not sent. Suitable thresholds can be arrived at after suitable calibration through tests. Routers may have step levels in which they increase power consumption when they additively are loaded with more large bandwidth consuming multicast or unicast streams. Calibrating these levels may be useful for implementing this scheme. It is possible that such calibrated thresholds can be used for advertising the PWRLINK ratios in the OSPF LSA advertisements. This would be useful for bringing down the frequency of updates or advertisements from a line-card about its PWRLINK ratio. When power consumption meanders within a certain given interval these ratios need not be re-advertised even if further unicast and/or multicast streams are added to it. The incentive is to recognize a linecard that does not drastically change power consumption even if large bandwidth streams are added onto it for forwarding and thus give it credit for its power optimal functioning. If a router tends to consume the highest level of power even when carrying low amounts of unicast and multicast streams on its line card, it would automatically have a poor ratio when compared to a Shankar Raman et.al Expires November 5, 2013 [Page 22] INTERNET DRAFT Building power shortest inter-Area TE-LSPs May 4, 2013 router that efficiently uses power when considering the Available Bandwidth being observed. The best case would be a low power consuming line-card or a router filled with such line cards that does not leave its power interval no matter how much ever capacity is sought to be used on it. But that would be an ideal condition but it is definitely an idealistic scenario towards which the router manufacturers should look at. 2.3 Advantages 1) The TE Reachability vector TLV contains the aggregate of all link attributes along with TE constraints and so the head-end of the TE path can explicitly select the ABR that connects the destination area even though it does not know the complete topology of the backbone area. 2) As the TE reachability vector contains only the aggregate attributes of k-power-shortest-paths, the flooding overhead to support the mechanism is limited. 3) Centralized path computation element is not required for supporting inter-area power-shortest-path TE. The additional overhead of computing k-power-shortest-paths on ABR can be solved by offloading the computation overhead to additional processor in multi- core platforms. Shankar Raman et.al Expires November 5, 2013 [Page 23] INTERNET DRAFT Building power shortest inter-Area TE-LSPs May 4, 2013 3 Security Considerations None. 4 IANA Considerations New TLV types for OSPF and IS-IS for the new TLVs that have been introduced need to be assigned. 5 References 5.1 Normative References 5.2 Informative References [1] G. Appenzeller, Sizing router buffers, Doctoral Thesis, Department of Electrical Engineering, Stanford University, 2005. [2] A. P. Bianzino, C. Chaudet, D. Rossi and J. L. Rougier, A survey of green networking research, IEEE Communications and Surveys Tutorials, preprint. [3] J. Baliga, K. Hinton and R. S. Tucker, Energy consumption of the internet, Proc. of joint international conference on optical internet, June 2007, pp. 1-3. [4] J. Chabarek, J. Sommers, P. Barford, C. Estan, D. Tsiang and S. Wright, Power awareness in network design and routing, Proc. of the IEEE INFOCOM 2008, April 2008, pp. 457-465. [5] B. Venkat et.al, Constructing disjoint and partially disjoint InterAS TE-LSPs, USPTO Patent 7751318, Cisco Systems, 2010. [6] M. Xia et. al., Greening the optical backbone network: A traffic engineering approach, IEEE ICC Proceedings, May 2010, pp. 1-5. [7] W. Lu and S. Sahni, Low-power TCAMs for very large forwarding tables, IEEE/ACM Transactions on Computer Networks, June 2010, vol. 18, no. 3, pp. 948-959. [8] B. Zhang, Routing Area Open Meeting, Proceedings of the IETF 81, Quebec, Canada, July 2011. Shankar Raman et.al Expires November 5, 2013 [Page 24] INTERNET DRAFT Building power shortest inter-Area TE-LSPs May 4, 2013 [9] M.J.S Raman, V.Balaji Venkat, G.Raina, Reducing Power consumption using the Border Gateway Protocol, IARIA conferences ENERGY 2012. [10] A.Cianfrani et al., An OSPF enhancement for energy saving in IP Networks, IEEE INFOCOM 2011 Workshop on Green Communications and Networking [11] Shankar Raman et al., draft-mjsraman-rtgwg-inter-as-psp-01.txt, Work in Progress, February 2012. Authors' Addresses Shankar Raman Department of Computer Science and Engineering IIT Madras Chennai - 600036 TamilNadu India EMail: mjsraman@cse.iitm.ac.in Balaji Venkat Venkataswami Department of Electrical Engineering IIT Madras Chennai - 600036 TamilNadu India EMail: balajivenkat299@gmail.com Prof.Gaurav Raina Department of Electrical Engineering IIT Madras Chennai - 600036 TamilNadu India EMail: gaurav@ee.iitm.ac.in Shankar Raman et.al Expires November 5, 2013 [Page 25]