INTERNET-DRAFT                               Balaji Venkat Venkataswami
Intended Status: Experimental RFC                      Bhargav Bhikkaji
Expires: August 2013                                               DELL
                                                          Shankar Raman
                                                             IIT Madras
                                                      February 17, 2013

  Load-balancing to Data Centers in a L3VPN environment based on Power
            draft-balaji-panet-dc-label-semantic-for-pwr-00

Abstract

   Data Centers may be spread across different locations for a
   particular enterprise. Different locations may mean within the same
   country but across different geographical locations, or outside the
   country, even on a different continent.
   These data centers may serve a single enterprise or multiple
   enterprises / tenants, wherein a regular enterprise site may request
   data from whichever data center site is proximal to it. Proximity is
   usually calculated from a metric that is bandwidth driven or that is
   based on the number of hops needed to reach the data center site,
   which brings delay characteristics into play. Assume a topology
   where the data center sites and the enterprise sites are MPLS based
   L3VPN sites that are provided connectivity through a Service
   Provider deploying Layer 3 VPNs. Given such a topology, data can be
   replicated across the data centers in a timely manner to keep it
   active and refreshed across all data center sites. Suitable
   mechanisms for such replication will come into play for this
   purpose. Thus any of the data centers can cater to a request from a
   user site.

   It is possible that power consumption in each data center may vary
   according to the load on each data center. It would be prudent to
   introduce a scheme in which a power metric, coupled with other
   metrics such as bandwidth and delay, is used by a Provider Edge
   router in an L3VPN scenario to direct the packets or requests from
   regular user sites to the data center with the least such metric.
   This is in line with the follow-the-moon strategy of directing
   requests for data and compute to data centers which are more
   power-efficient during the night or during the day. This document
   lays out one such proposal.

Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with
   the provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/1id-abstracts.html

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html

Copyright and License Notice

   Copyright (c) 2013 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with
   respect to this document. Code Components extracted from this
   document must include Simplified BSD License text as described in
   Section 4.e of the Trust Legal Provisions and are provided without
   warranty as described in the Simplified BSD License.

Table of Contents

   1  Introduction
     1.1  Terminology
   2.0  Reference Topology
     2.1  Methodology of the new scheme
     2.2  Calculation of the Compound Power Metric
       2.2.1  Power metric calculation for the DC as a whole and per
              tenant
     2.3  Extending the labeling scheme to the CE
     2.4  Extensions to MP-iBGP for this scheme.
   3  Security Considerations
   4  IANA Considerations
   5  References
     5.1  Normative References
     5.2  Informative References
   APPENDIX - A : References for power saving related material
   Authors' Addresses

1  Introduction

   Data Centers may be spread across different locations for a
   particular enterprise. Different locations may mean within the same
   country but across different geographical locations, or outside the
   country, even on a different continent. These data centers may serve
   a single enterprise or multiple enterprises / tenants, wherein a
   regular enterprise site may request data from whichever data center
   site is proximal to it. Proximity is usually calculated from a
   metric that is bandwidth driven or that is based on the number of
   hops needed to reach the data center site, which brings delay
   characteristics into play. Assume a topology where the data center
   sites and the enterprise sites are MPLS based L3VPN sites that are
   provided connectivity through a Service Provider deploying Layer 3
   VPNs. Given such a topology, data can be replicated across the data
   centers in a timely manner to keep it active and refreshed across
   all data center sites. Suitable mechanisms for such replication will
   come into play for this purpose. Thus any such data center site can
   cater to a request from a user site.

   It is possible that power consumption in each data center may vary
   according to the load on each data center.
   It would be prudent to introduce a scheme in which a power metric,
   coupled with other metrics such as bandwidth and delay, is used by a
   Provider Edge router in an L3VPN scenario to direct the packets or
   requests from regular user sites to the data center with the least
   such metric. This is in line with the follow-the-moon strategy of
   directing requests for data and compute to data centers which are
   more power-efficient during the night or during the day. This
   document lays out one such proposal.

1.1  Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

2.0  Reference Topology

   Assume the following topology, where Data Centers 1 and 2 are
   geographically dispersed within the country or across different
   continents, and each has sufficiently desirable delay and bandwidth
   properties allocated to it. The End User Sites depicted below the
   CEs would be enterprise sites that request data and compute from
   these Data Center sites. The DC GWs connecting the Data Centers
   would play the role of CEs for the Data Centers. The PEs (1-5) are
   the Provider Edge routers of the ISP Core network that provide
   regular MPLS based L3VPN services for the inter-connection of the
   Data Center sites with the End User sites.

   During the normal course of events, VPN instance labels would be
   exchanged between the PEs, and the ISP core would provide LDP or
   RSVP based connectivity amongst these PEs. Thus a label stack with
   the VPN instance label as the inner label and an LDP or RSVP label
   as the outer label would be used to direct traffic from End User
   Sites to the Data Center sites, and even amongst the End User Sites
   themselves.
   It is also possible that replication services would run between the
   Data Center sites using this mechanism.

               __________               ,---------.
              /                       ,'           `.
             ;Data Center)           (  Data Center  )
             (     2     '            `.     1     ,'
              +-----------+             `-+------+'
                    \                    /
                   +--+--+            +-+---+
                   |DC GW|            |DC GW|
                   +-+---+            +-----+
                     |                   |
                     PE5                 PE4
                       .--.          .--.
                      (    '       '.--.
                   .-.'     ISP Core     '
                  (          network      )
                   (                   .'-'
                    '--'._.'.           )PE3
                  PE1        '--'        \ \
                 / /          PE2         \ \
                / /            |           \ \
            +---+--+       +------+     +--+----+
            | CE1  |       | CE2  |     | CE3   |
            +-+--`.+       +-+----+     +-+--+--+
             __/_             \            \__
           '--------'     '--------'    '--------'
           :End User:     :End User:    :End User:
           :  Site  :     :  Site  :    :  Site  :
           '--------'     '--------'    '--------'

   Figure 1 : A Generic Architecture for Multiple Data Centers
              providing Data and Compute services for End user sites.

2.1  Methodology of the new scheme

   When an enterprise site connected through an L3VPN intends to access
   a data center, load-balancing mechanisms may use the nearest data
   center, one within the nearest country, or even one on another
   continent, depending on availability and the load being serviced.
   The method proposed in this document advertises a power consumption
   rating along with an L3VPN service instance label for each Data
   Center site. The PE of the enterprise site accessing the Data Center
   can then decide which DC to access based on the power consumption
   rating advertised in the MP-iBGP update. A suitable level of trust
   between the provider and the customer is advisable in this case.

   The PE devices would receive a different VPN instance label for each
   of the Data Centers in an MP-iBGP update that carries the compound
   power metric in its attribute information. Suitable extensions to
   the extended community attributes would be made to carry this
   information.
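   The decision an ingress PE makes in this scheme can be sketched as
   follows. This is only an illustration under stated assumptions: the
   DcRoute structure, the select_route function and the tie-break on
   the lower label are hypothetical and are not defined by this
   document. The label and metric values are those used in the example
   of this section.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DcRoute:
    """A VPN route towards a data center as held by an ingress PE.

    Hypothetical structure for illustration; the draft does not define
    how a PE stores this information.
    """
    egress_pe: str              # PE attached to the data center gateway
    vpn_label: int              # VPN instance label from the MP-iBGP update
    compound_power_metric: int  # advertised compound power metric


def select_route(routes):
    """Return the route with the least compound power metric.

    Ties are broken on the lower VPN label. The tie-break is an
    assumption of this sketch, not part of the proposal.
    """
    if not routes:
        raise ValueError("no data center routes available")
    return min(routes, key=lambda r: (r.compound_power_metric, r.vpn_label))


# Values taken from the example in this section.
routes = [
    DcRoute(egress_pe="PE4", vpn_label=100, compound_power_metric=1200),
    DcRoute(egress_pe="PE5", vpn_label=200, compound_power_metric=1400),
]
best = select_route(routes)
print(best.egress_pe, best.vpn_label)
```

   With these inputs the sketch picks the route through PE4 with label
   100, which is the data center with the lower metric.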
   Assume Label 100 is sent from PE4 to the enterprise sites with
   compound power metric 1200 and Label 200 is sent from PE5 to the
   enterprise sites with compound power metric 1400 at a certain point
   in time.

   When the End User sites request a service from a data center, the
   PEs (1-3), which hold labels 100 and 200 with their respective
   compound power metrics 1200 and 1400, determine which Data Center
   (1 or 2) has the least compound metric and direct the request
   towards the PE connected to that data center.

               __________               ,---------.
              /                       ,'           `.
             ;Data Center)           (  Data Center  )
             (     2     '            `.     1     ,'
              +-----------+             `-+------+'
     pwr = 1400     \                    /     pwr = 1200
                   +--+--+            +-+---+
                   |DC GW|            |DC GW|
                   +-+---+            +-----+
                     |                   |
   (200, pwr = 1400) PE5                 PE4 (100, pwr = 1200)
                       .--.          .--.
                      (    '       '.--.
                   .-.'     ISP Core     '
                  (          network      )
                   (                   .'-'
                    '--'._.'.           )PE3
                  PE1        '--'        \ \
                 / /          PE2         \ \
                / /            |           \ \
            +---+--+       +---+--+     +--+----+
            | CE1  |       | CE2  |     | CE3   |
            +-+--`.+       +-+----+     +-+--+--+
             __/_             \            \__
           '--------'     '--------'    '--------'
           :End User:     :End User:    :End User:
           :  Site  :     :  Site  :    :  Site  :
           '--------'     '--------'    '--------'

   Figure 2: Control plane exchanges between PE5, PE4 and PEs (1-3)

   In this case Data Center 1 has the least compound power metric,
   1200, and hence the traffic from, say, CE1 would be sent to Data
   Center 1. The compound power metric is calculated periodically in
   each of the Data Center sites and is exchanged with the PE through,
   for example, eBGP between the respective CEs and PEs. The VPN
   instance label used to reach Data Center 1 would be 100, and the
   reachable PE would be set to PE4.

               __________               ,---------.
              /                       ,'           `.
             ;Data Center)           (  Data Center  )
             (     2     '            `.     1     ,'
              +-----------+             `-+------+'
     pwr = 1400     \                    /     pwr = 1200
                   +--+--+            +-+---+
                   |DC GW|            |DC GW|
                   +-+---+            +-----+
                     |                   |
                     PE5                 PE4
                       .--.          .--.
                      (    '       '.--.
                   .-.'     ISP Core     '
                  (          network      )
                   (                   .'-'
                    '--'._.'.           )PE3
 (LDP Label, 100) PE1        '--'        \ \
                 / /          PE2         \ \
                / /            |           \ \
            +---+--+       +---+--+     +--+----+
            | CE1  |       | CE2  |     | CE3   |
            +-+--`.+       +-+----+     +-+--+--+
             __/_             \            \__
           '--------'     '--------'    '--------'
           :End User:     :End User:    :End User:
           :  Site  :     :  Site  :    :  Site  :
           '--------'     '--------'    '--------'

   Figure 3: Data Plane path from say CE1 to PE4 based on Power Metric
             for DC.

   Here the power metric is not advertised as an exact value but as a
   threshold interval. Only the threshold interval is exchanged between
   the CEs and the PEs, so that frequent fluctuations or oscillations
   within a given power metric interval are dampened.

   Inter-DC connectivity and replication may also benefit from this
   scheme. DC A would choose to replicate with DC B at a time when the
   power consumption rating in both sites or in DC B is the lowest.

2.2  Calculation of the Compound Power Metric

   Factors such as compute load, cooling power consumption and network
   bandwidth within the Data Center would be used to compute the
   compound power metric that a Data Center advertises to the PEs
   within the Layer 3 VPN. The actual calculation is out of scope of
   this document, and sufficient literature exists to suggest a
   suitable method for it. For now, this draft proposes a scheme for
   exchanging such power metrics and using them for load-balancing
   within an L3VPN scenario.

2.2.1  Power metric calculation for the DC as a whole and per tenant

   Many schemes exist for calculating the power consumed per tenant
   based on the occupancy of each tenant in a DC and for calculating
   the power consumed by the DC as a whole.
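   As one illustration only, since this document leaves the actual
   calculation out of scope, a DC-wide power metric could be
   apportioned to tenants in proportion to their occupancy. The
   proportional rule, the per_tenant_power function, the tenant names
   and the occupancy figures below are all assumptions made for this
   sketch and are not part of the proposal.

```python
def per_tenant_power(dc_power_metric, occupancy):
    """Split a DC-wide power metric across tenants by occupancy share.

    occupancy maps a tenant identifier to a non-negative occupancy
    measure (for example racks or virtual machines in use).
    Proportional sharing is an assumption of this sketch.
    """
    total = sum(occupancy.values())
    if total <= 0:
        raise ValueError("total occupancy must be positive")
    return {tenant: dc_power_metric * share / total
            for tenant, share in occupancy.items()}


# Hypothetical tenants and occupancy figures.
shares = per_tenant_power(1200, {"tenant-a": 30, "tenant-b": 10})
print(shares)
```

   The per-tenant values sum to the DC-wide metric, so the same
   function covers the single-tenant case, where the one tenant is
   charged the whole metric.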
   Appropriate labels can be exchanged per tenant, with the respective
   power metrics factored in per tenant if necessary. A specific end
   user site that uses the data and compute power of a data center may
   belong to a particular tenant and may wish to direct its traffic to
   a data center based on the power consumed for that specific tenant
   identifier, and hence use the appropriate label for it in the PE. If
   the Data Centers cater to a single tenant and are owned by that
   tenant, then the overall power consumed by the DC will be used in
   the MP-iBGP update.

2.3  Extending the labeling scheme to the CE

   It is possible to extend label imposition to the CE itself, towards
   the ISP core, so that the CE can be made aware that such a scheme is
   available and can use appropriate labels to indicate to the PEs that
   it requires power metric based load-balancing. Two different labels
   could be used by the CEs: one for conventional methods of requesting
   services and the other for the power metric based method, in which
   the PEs would consult the available power metrics and direct the
   request towards the data center consuming the least power.

   If trust levels are not to be adhered to, the label may be
   propagated along with the power consumption rating to the CE, and
   the CE would make the appropriate decision.

2.4  Extensions to MP-iBGP for this scheme.

   A future version of the draft will outline the actual extensions to
   the BGP protocol and its attributes with regard to how the compound
   power metric is carried in the actual BGP exchanges.

3  Security Considerations

   An appropriate level of trust must exist between the Provider Edge
   and the Customer Edge before the Data Center's power metrics are
   exchanged between the PE and CE. eBGP could be used as the PE-CE
   protocol for this purpose.
   Appropriate security mechanisms would have to be taken into account
   if the Data Center is serving multiple tenants. The Compound Power
   Metric may be calculated for each tenant, and mechanisms should be
   adopted to ensure that one tenant's compound metric is not shared
   with other tenants. Label exchanges carrying each tenant's label
   information and corresponding power metrics should be done with
   such separation in mind. If there is a collated power metric for all
   the tenants put together, then the PE device should make sure that
   other Data Center providers' information is held separately in its
   tables.

4  IANA Considerations

   Suitable IANA considerations for extending the BGP extended
   community attribute to accommodate the power metric information in
   the MP-iBGP update are to be taken into account. This will be made
   clearer in subsequent versions of the document.

5  References

5.1  Normative References

   Please see Appendix A.

5.2  Informative References

   Please see Appendix A.

APPENDIX - A : References for power saving related material

   M. Zhang, J. Dong, B. Zhang, "Use Cases for Power-Aware
   Networks", draft-zhang-panet-use-cases (work in progress)

   B. Nordman, K. Christensen, "Nanogrids",
   draft-nordman-nanogrids-00 (work in progress)

   T. Suzuki, T. Tarui, "Requirements for an Energy-Efficient
   Network System", draft-suzuki-eens-requirements (work in
   progress)

   Z. Cao, "Synchronization Layer: an Implementation Method
   for Energy Efficient Sensor Stack",
   draft-cao-lwig-syn-layer (work in progress)

   A. Junior, R. Sofia, "Energy-awareness metrics global
   applicability guideline",
   draft-ajunior-energy-awareness-00 (work in progress)

   B. Zhang, J. Shi, M. Zhang, J. Dong, "Power-aware Routing
   and Traffic Engineering: Requirements, Approaches, and
   Issues", draft-zhang-greennet (work in progress)

   T. Suganuma, N. Nakamura, S. Izumi, H. Tsunoda, M.
   Matsuda, K. Ohta, "Green Usage Monitoring Information
   Base", draft-suganuma-greenmib (work in progress)

   S. Raman, B. V. Venkataswami, G. Raina, V. Srini, "Power
   Based Topologies and TE-Shortest Power Paths in OSPF",
   draft-mjsraman-rtgwg-ospf-power-topo-01 (work in
   progress)

   S. Raman, B. V. Venkataswami, G. Raina, V. Srini,
   "Building power optimal Multicast Trees",
   draft-mjsraman-rtgwg-pim-power-02 (work in progress)

   S. Raman, B. V. Venkataswami, G. Raina, "Reducing Power
   Consumption using BGP",
   draft-mjsraman-rtgwg-inter-as-psp-03 (work in progress)

   S. Raman, B. V. Venkataswami, G. Raina, "Building power
   shortest inter-Area TE LSPs using pre-computed paths",
   draft-mjsraman-rtgwg-intra-as-psp-te-leak-02 (work in
   progress)

   S. Raman, B. V. Venkataswami, G. Raina, V. Srini,
   "Reducing Power Consumption using BGP path selection",
   draft-mjsraman-rtgwg-bgp-power-path-02 (work in progress)

Authors' Addresses

   Balaji Venkat Venkataswami
   DELL
   Plot #1 SIDCO Estate
   Olympia Tech Park
   Guindy
   Chennai
   India

   EMail: balaji_venkat_venkat@dell.com

   Bhargav Bhikkaji
   DELL
   350 Holger Way
   San Jose, CA
   U.S.A

   EMail: Bhargav_Bhikkaji@dell.com

   Shankar Raman
   Department of Computer Science and Engineering
   IIT Madras
   Chennai - 600036
   TamilNadu
   India

   EMail: mjsraman@cse.iitm.ac.in