idnits 2.17.1

draft-lee-network-stratum-query-problem-02.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (April 20, 2011) is 4754 days in the past.  Is this
     intentional?

  Checking references for intended status: Informational
  ----------------------------------------------------------------------------

  == Missing Reference: 'ITU-T Y.1541' is mentioned on line 407, but not
     defined

  -- Obsolete informational reference (is this intentional?): RFC 2261
     (Obsoleted by RFC 2271)

  -- Obsolete informational reference (is this intentional?): RFC 2265
     (Obsoleted by RFC 2275)

     Summary: 0 errors (**), 0 flaws (~~), 2 warnings (==), 3 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.
--------------------------------------------------------------------------------

1    Network Working Group                                 Young Lee (Huawei)
2    Internet Draft                                    Dave McDysan (Verizon)
3    Intended Status: Informational                             Ning So (UTD)
4                                                     Greg Bernstein (Grotto)
5                                                         Tae Yeon Kim (ETRI)
6                                                        Kohei Shiomoto (NTT)
7                                         Oscar Gonzalez de Dios (Telefonica)

9                                                              April 20, 2011

11                 Problem Statement for Network Stratum Query

13               draft-lee-network-stratum-query-problem-02.txt

15   Status of this Memo

17      This Internet-Draft is submitted to IETF in full conformance with the
18      provisions of BCP 78 and BCP 79.

20      Internet-Drafts are working documents of the Internet Engineering
21      Task Force (IETF), its areas, and its working groups.  Note that
22      other groups may also distribute working documents as Internet-
23      Drafts.

25      Internet-Drafts are draft documents valid for a maximum of six months
26      and may be updated, replaced, or obsoleted by other documents at any
27      time.  It is inappropriate to use Internet-Drafts as reference
28      material or to cite them other than as "work in progress."

30      The list of current Internet-Drafts can be accessed at
31      http://www.ietf.org/ietf/1id-abstracts.txt

33      The list of Internet-Draft Shadow Directories can be accessed at
34      http://www.ietf.org/shadow.html.

36      This Internet-Draft will expire on April 20, 2011.

38   Copyright Notice

40      Copyright (c) 2010 IETF Trust and the persons identified as the
41      document authors.  All rights reserved.

43      This document is subject to BCP 78 and the IETF Trust's Legal
44      Provisions Relating to IETF Documents
45      (http://trustee.ietf.org/license-info) in effect on the date of
46      publication of this document.  Please review these documents
47      carefully, as they describe your rights and restrictions with respect
48      to this document.

50   Abstract

52      This document describes the general problem of network stratum query
53      for application optimization.
        Network Stratum query is the ability to
54      query the network from an application controller such as those used
55      in Data Centers so that application controller decisions such as
56      server assignment or virtual machine instantiation/migration can be
57      made with better knowledge of the underlying network conditions.

59      As application servers are distributed geographically across Data
60      Centers, many application-related decisions such as which server to
61      assign a new client or where to instantiate/migrate virtual machines
62      will be sub-optimal unless the underlying network
63      conditions are factored into the decision process.  The lack of
64      network awareness may result in not meeting the end-user service
65      objective for some key applications like video gaming/conferencing
66      that have stringent latency and bandwidth requirements.

68   Table of Contents

70      1. Introduction
71      2. Network Contexts
72      3. Problem Statement
73      4. High-level requirements
74         4.1. Data Center-Network Stratum Communication (NS Query)
76            4.1.1. Application Profile
77            4.1.2. Network Load Data to be queried
78            4.1.3. Responses to NS Query from network to application
79      5. Security Considerations
80      6. References
81      Author's Addresses
82      Intellectual Property Statement
83      Disclaimer of Validity

85   1. Introduction

87      Cross Stratum Optimization is a joint optimization effort in
88      allocating resources to end-users that involves both the Application
89      Stratum and Network Stratum.

91      The application stratum is the functional block which manages and
92      controls application resources and provides application resources to
93      a variety of clients/end-users.  Application resources are
94      non-network resources critical to achieving the application service
95      functionality.  Examples include: application specific servers,
96      storage, content, large data sets, and computing power.  Data Centers
97      are regarded as a tangible realization of the application stratum
98      architecture.

100     The network stratum is the functional block which manages and
101     controls network resources and provides transport of data between
102     clients/end-users and application resources, and among application
103     resources.  Network resources are resources at layer 3 or below
104     (L1/L2/L3) such as bandwidth, links, paths, path processing
105     (creation, deletion, and management), network databases, path
106     computation, admission control, and resource reservation capability.

108     Application services offered by Data Centers by their very nature
109     utilize application resources (e.g., servers, storage, memory,
110     etc.) in Data Centers, and the underlying network resources
111     provided by LANs, MANs, and carriers' transport networks.

113     As the application servers are distributed geographically across many
114     Data Centers, decisions such as server assignment or new virtual
115     machine instantiation/migration will be sub-optimal
116     unless the underlying network conditions are factored into the
117     decision process.  The lack of network awareness may result in not
118     meeting the end-user service objective for some key applications like
119     video gaming/conferencing that have stringent latency and bandwidth
120     requirements.

122     This document describes the general problem of network stratum query
123     (NS Query) in Data Center environments.
        Network Stratum query is the
124     ability to query the network from the application controller in Data
125     Centers so that application server assignment or virtual machine
126     instantiation/migration decisions can be made jointly, based on
127     both the application resource/load status and the network
128     resource/load status.

130     The NS query is different from typical "horizontal" query
131     capabilities in the network.  A horizontal query in the network is
132     carried out by the head end (i.e., data source), which "probes" the
133     network to test the capabilities for data flows to/from a particular
134     point in the network.  This is a horizontal scheme.

136     NS Query consists of two stages:

138        . A vertical query capability where an external point (i.e., the
139          Application Control Gateway (ACG) in the Data Center) queries
140          the network (i.e., the Network Control Gateway (NCG)); and

142        . A horizontal query capability where the NCG gathers the
143          collective information of a variety of horizontal schemes
144          (IPPM, IGP, RIB, etc.) implemented in the network stratum.

146     NS Query does not re-invent the wheel on existing network
147     capabilities but tries to reuse them where possible.

149  2. Network Contexts

151     Figure 1 shows a typical data center architecture where an end-user
152     (the point of consuming resource) needs to be connected for its
153     application (e.g., gaming) to a server located in one of several
154     geographically distributed data centers.

156                             ---------------
157      ----------           |     DC 1      |
158     | End-user |. . . . .>|      o o o    |
159     |          |          |       \|/     |
160      ----------           |        O      |
161           |                ----- --|------
162           |                        |
163           |                        |
164           |       -----------------|-----------
165           |      /                 |           \
166           |     /        ..........O PE1        \     --------------
167           |    |         .                       |   | o o o  DC 2  |
168           |    |  PE4    .                  PE2  |   |  \|/         |
169            ----|---O.........................O---|---|---O          |
170                |       .                         |   |              |
171                |       .        PE3              |    --------------
172                 \      ..........O   Carrier    /
173                  \               |   Network   /
174                   ---------------|-------------
175                                  |
176                          --------|------
177                         |        O      |
178                         |       /|\     |
179                         |      o o o    |
180                         |      DC 3     |
181                          ---------------

183                    Figure 1. Data Center Architecture

185     Figure 1 shows that the user application can be served by any of the
186     servers in DC1, DC2 or DC3.  When the initial request arrives at the
187     proxy server in DC1, the proxy server (also known as the load
188     balancer) would ideally assign an "optimal" server based on both
189     server resource/load status and network resource/load status.  Today,
190     however, this server assignment decision is limited by the lack of
191     network awareness in the application's decision-making process.

193     For example, the proxy server close to the user in Data Center 1 may
194     find a good server in Data Center 1 that can serve the application.
195     Assume that this particular application requires a minimum bandwidth
196     guarantee of x and a latency of less than y ms.  The route that
197     serves Data Center 1 traffic to the end-user (PE1 - PE4) may not have
198     enough capacity at the moment of service instantiation, and therefore
199     the service objective of the end-user may not be satisfied if that
200     route were taken.

202     On the other hand, there may be good servers available in Data
203     Centers 2 and 3 and their routes (PE2-PE4 and PE3-PE4) may have
204     enough capacity to meet the service requirement.

206     This example illustrates the benefit of and the need for joint
207     optimization across the application and network strata.  NS Query is
208     the ability to query the network from an application to collect a
209     certain level of network information.  No such mechanism exists in
210     today's Internet Protocol technologies.

212     Figure 2 shows the context of NS Query in more detail within the
213     overarching data center architecture shown in Figure 1.

215                         --------------------------------------------
216                        |            Application Overlay             |
217                        |               (Data Centers)               |
218                        |                                            |
219      ----------        |  --------------         --------------     |
220     | End-User |       | | Application  |. . . .| Application  |    |
221     |          |. . . >| | Control      |       | Processes    |    |
222      ----------        | | Gateway (ACG)|        --------------     |
223                        | |              |        --------------     |
224                        |  -------------- . . . .| Application  |    |
225                        |          /\            | Related Data |    |
226                        |          ||             --------------     |
227                         ----------||--------------------------------
228                                   ||
229                                   ||  Network Stratum Query (First Stage)
230                                   ||
231                         ----------||--------------------------------
232                        |          \/    Network Underlay            |
233                        |                                            |
234                        |  --------------        ----------------    |
235                        | | Network      |. . . | Network        |   |
236                        | | Control      |      | Processes      |   |
237                        | | Gateway (NCG)|       ----------------    |
238                        | |              |       ----------------    |
239                        |  --------------       | Network        |   |
240                        |        |------------->| Related Data   |   |
241                        |       (Second Stage)   ----------------    |
242                         --------------------------------------------

244                      Figure 2. NS Query Architecture

246     Figure 2 shows the key architectural components that enable the NS
247     Query capability.  The Application Control Gateway (ACG) is the proxy
248     gateway that interfaces with the network and generates queries to it.
249     The ACG can query various metric values that may contribute to
250     meeting the overall service objective of an application.  This is a
251     vertical query (Stage 1).

253     In the network stratum, the Network Control Gateway (NCG) serves as
254     the proxy gateway to the network.  The NCG receives the query request
255     from the ACG, probes the network to test the capabilities for data
256     flows to/from a particular point in the network, and gathers the
257     collective information of a variety of horizontal schemes (IPPM,
258     IGP, MIB, TED, etc.) implemented in the network stratum.  This is a
259     horizontal query (Stage 2).

261     Further, the NCG provides the responses to the original query sent
262     from the ACG.  The data collected by the NCG needs to be abstracted.
263     This abstraction is needed on two grounds.

265     First, the network does not usually reveal its details to
266     outside entities.  Although the Data Center providers and the
267     carriers are business partners in providing application services to
268     the end-users and to the application providers (e.g., gaming
269     providers), detailed network data may not be leaked to the Data
270     Centers, and vice versa.

272     Secondly, detailed network data may not be understood by the
273     application.  Link- or node-level data in and of themselves may not
274     be useful to the application.  For instance,
275     latency or bandwidth at the link level is too detailed for the
276     application to handle.  Instead, latency or bandwidth at the route
277     level (i.e., PE1 - PE4 in Figure 1) will help the application make
278     its server selection/instantiation decision.

280     The abstraction function needs to be provided by the NCG.  Note that
281     the NCG plays a head end role within the network, probing/collecting
282     network performance/management data (e.g., IPPM, MIB, etc.),
283     routing data [MRT] (e.g., LSDB, TED, BGP-RIB, etc.) and others.  Once
284     the basic data is collected, the NCG will need to abstract/summarize
285     it before sending it to the application.

287  3. Problem Statement

289  3.1. Limitation of existing probing schemes

291     The current state-of-the-art probing schemes from an external point
292     are based on ping- or traceroute-like mechanisms, which assume
293     that the underlying transport network is an L3 network and
294     that the routing is simple IP forwarding.

296     In reality, the carrier's routing schemes are likely to include IP
297     tunneling or MPLS tunneling on top of or in place of IP forwarding.
298     In some cases, the actual network may be a VPN, MPLS-TE or GMPLS-TE
299     network where traceroute does not work.

301     This implies that network status estimates made from the
302     application stratum cannot be accurate.
        Thus, application resource
303     allocation to end-users can be sub-optimal and fail to meet the
304     performance objectives of the application.

306  3.2. Lack of vertical query schemes

308     Currently, the query in the network is carried out by the head end
309     (i.e., data source), which "probes" the network to test the
310     capabilities for data flows to/from a particular point in the
311     network.  This is a horizontal scheme.

313     There is no standard "vertical" query scheme that allows an
314     application control gateway in a Data Center to query the network
315     stratum in a way suitable for a third party (i.e., an entity
316     "outside" the network).

318     Due to the lack of a standard vertical query scheme, there is a
319     limitation on exchanging information between application and network
320     that would increase the efficiency of joint optimization across
321     application and network.  For instance, the ability to exchange the
322     application profile information (defined in Section 4.1) or network
323     capability information between application and network would increase
324     the efficiency of resource allocation across application and network.

326  3.3. Limitation of SNMP MIB network monitoring techniques

328     SNMP MIB monitoring techniques as defined in [RFC2261] and [RFC2265]
329     do not provide mechanisms to guarantee synchronization of the data
330     collection.  This higher level of synchronization is necessary to
331     a) serve applications with stringent QoS and bandwidth requirements,
332     or b) better schedule massive quantities of small data flows.

334     In addition, SNMP MIB network monitoring lacks a whole network query
335     capability.  A whole network query is a query to gather information
336     across many boxes simultaneously under the control of a single
337     administration domain (AD) as defined in RFC 1136.  A single AD means
338     a single AS or multiple ASes under the control of a single AD.

340  3.4. Lack of abstraction mechanisms

342     Most of the information needed to provide NS Query is currently
343     available from the network; however, it is not aggregated into a form
344     suitable for use by the application stratum.  For example, from
345     commonly monitored SNMP-based link statistics and current routing
346     tables, one can easily compute average available bandwidth and many
347     other statistical performance measures such as packet loss, latency,
348     etc.

350     However, neither the raw SNMP nor routing table data should be
351     delivered to the application stratum since (a) this reveals too much
352     information concerning the carrier's network, and (b) it presents too
353     much information to transfer to each application.  This warrants some
354     work on abstraction on the network side to preserve the privacy of
355     network stratum details from the application stratum.

357  4. High-level requirements

359     This section discusses high-level requirements to support NS Query in
360     Data Center environments.

362     The ACG plays the key role, functioning as the application gateway to
363     the network and running the NS Query.  The ACG has access to the
364     end-user profile for the application and to the locations of
365     candidate servers, both local and remote.  How the ACG accesses this
366     information is beyond the scope of this work.

368  4.1. Application Profile

370     The application stratum needs to provide the application profile to
371     the network.

373     Examples of profile information that is useful to the network are as
374     follows:

376        . End user IP address;

378        . User access router IP address;

380        . Authentication Profile: Authentication Key;

382        . Bandwidth Profile: Minimum bandwidth required for the
383          application;

385        . Connectivity Profile: P-P, P-MP, Anycast (Multi-destination);

387        . Directionality of the connectivity: unidirectional, bi-
388          directional;

390        . Path Estimation Objective Function: Min latency, etc.
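
   As a rough illustration only, the profile items listed above could be
   carried in a small record such as the following sketch.  The draft
   does not define any encoding; the class name, field names, units
   (Mbit/s) and the "min-latency" label are all hypothetical choices
   made for this example.

```python
from dataclasses import dataclass
from enum import Enum
from ipaddress import ip_address


class Connectivity(Enum):
    """Connectivity Profile values listed in Section 4.1."""
    P2P = "P-P"
    P2MP = "P-MP"
    ANYCAST = "Anycast"


class Directionality(Enum):
    """Directionality values listed in Section 4.1."""
    UNIDIRECTIONAL = "unidirectional"
    BIDIRECTIONAL = "bi-directional"


@dataclass(frozen=True)
class ApplicationProfile:
    """Hypothetical container for the profile an ACG hands to the NCG."""
    end_user_ip: str
    access_router_ip: str
    authentication_key: str
    min_bandwidth_mbps: float          # unit is an assumption of this sketch
    connectivity: Connectivity
    directionality: Directionality
    objective: str = "min-latency"     # Path Estimation Objective Function

    def __post_init__(self):
        # ip_address() raises ValueError for malformed addresses.
        ip_address(self.end_user_ip)
        ip_address(self.access_router_ip)
        if self.min_bandwidth_mbps <= 0:
            raise ValueError("minimum bandwidth must be positive")


# Example values use documentation address space (192.0.2.0/24).
profile = ApplicationProfile(
    end_user_ip="192.0.2.10",
    access_router_ip="192.0.2.1",
    authentication_key="example-key",
    min_bandwidth_mbps=5.0,
    connectivity=Connectivity.P2P,
    directionality=Directionality.BIDIRECTIONAL,
)
print(profile.connectivity.value, profile.objective)
```

   Validating the addresses and the bandwidth at construction time keeps
   malformed profiles from ever reaching the query stage; a real design
   would also have to settle how the authentication key is protected.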

392     Additional profile information can be added depending on the network
393     capability.

395  4.2. Network Load Data to be queried (First Stage)

397     For a given location mapping (i.e., from the server
398     location to the end-user location), the query from an application can
399     ask for the following network load data:

401        . Type of networks and the technical capabilities of the networks;
402        . Bandwidth capabilities and availability;
403        . Latency;
404        . Jitter;
405        . Packet loss;
406        . Other Network Performance Objectives (NPOs) as defined in
407          section 5 of [ITU-T Y.1541].

409     Note that this can be asked in a different way.  For example, the
410     query can simply ask:

412        . Can you give me a route with x amount of bandwidth (from server
413          to end-user) within y ms of latency?
414        . Can you give me a route with x amount of bandwidth (from server
415          to end-user) with no packet loss?

417  4.3. A Whole Network Query capability (Second Stage)

419     Upon request from the application (specifically, the ACG in Figure
420     2), the network (specifically, the NCG in Figure 2) should perform "a
421     whole network query" of information.

423     A whole network query is a query to gather information across many
424     boxes simultaneously under the control of a single administration
425     domain (AD) as defined in RFC 1136.  A single AD means a single AS
426     or multiple ASes under the control of a single AD.

428     The scope of a whole network query can include the topology of the
429     network, the bandwidth availability for the routes of interest, the
430     capabilities and congestion of links and routes, an indication of
431     the contribution of each link and route to delay and jitter,
432     and so on.

434  4.4. Data Synchronization Mechanism

436     The ability to capture the data at the same instant should be
437     provided.

439  4.5. Responses to NS Query from network to application

441     Given the network query from the application, the network should
442     provide the following mechanisms:

444     - For a given location mapping from the application (i.e.,
445        from the server location to the end-user location) and the
446        information gathered by the second-stage query discussed in
447        Section 4.3, the network needs to present the requested
448        information in a standard format and respond to the application.

450     The actual abstraction mechanism is beyond the scope of this
451     document.

453  5. Security Considerations

455     TBD

457  6. IANA Considerations

459     This informational document does not make any requests for IANA
460     action.

462  7. References

464  7.1. Informative References

466     [RFC2261] D. Harrington, et al., "An Architecture for Describing SNMP
467               Management Frameworks," January, 1998.

469     [RFC2265] B. Wijnen, et al., "View-based Access Control Model (VACM)
470               for the Simple Network Management Protocol (SNMP),"
471               January, 1998.

473     [Y.2011]  General principles and general reference model for Next
474               Generation Networks, October, 2004.

476     [Y.2012]  Functional Requirements and architecture of the NGN, April,
477               2010.

479     [MRT]     L. Blunk, M. Karir, and C. Labovitz, "MRT routing
480               information export format," draft-ietf-grow-mrt, work in
481               progress.

483  Authors' Addresses

485     Young Lee
486     Huawei Technologies
487     1700 Alma Drive, Suite 500
488     Plano, TX 75075
489     USA
490     Phone: (972) 509-5599
491     Email: ylee@huawei.com

493     Ning So
494     University of Texas at Dallas
495     Email: ningso@yahoo.com

497     Dave McDysan
498     Verizon Business
499     Email: dave.mcdysan@verizon.com

501     Greg M. Bernstein
502     Grotto Networking
503     Fremont California, USA
504     Phone: (510) 573-2237
505     Email: gregb@grotto-networking.com

507     Tae Yeon Kim
508     ETRI
509     Email: tykim@etri.or.kr

511     Kohei Shiomoto
512     NTT
513     Email: shiomoto.kohei@lab.ntt.co.jp

515     Oscar Gonzalez de Dios
516     Telefonica
517     Email: ogondio@tid.es

519  Intellectual Property Statement

521     The IETF Trust takes no position regarding the validity or scope of
522     any Intellectual Property Rights or other rights that might be
523     claimed to pertain to the implementation or use of the technology
524     described in any IETF Document or the extent to which any license
525     under such rights might or might not be available; nor does it
526     represent that it has made any independent effort to identify any
527     such rights.

529     Copies of Intellectual Property disclosures made to the IETF
530     Secretariat and any assurances of licenses to be made available, or
531     the result of an attempt made to obtain a general license or
532     permission for the use of such proprietary rights by implementers or
533     users of this specification can be obtained from the IETF on-line IPR
534     repository at http://www.ietf.org/ipr

536     The IETF invites any interested party to bring to its attention any
537     copyrights, patents or patent applications, or other proprietary
538     rights that may cover technology that may be required to implement
539     any standard or specification contained in an IETF Document.  Please
540     address the information to the IETF at ietf-ipr@ietf.org.

542  Disclaimer of Validity

544     All IETF Documents and the information contained therein are provided
545     on an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE
546     REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE
547     IETF TRUST AND THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL
548     WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY
549     WARRANTY THAT THE USE OF THE INFORMATION THEREIN WILL NOT INFRINGE
550     ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS
551     FOR A PARTICULAR PURPOSE.

553  Acknowledgment

555     Funding for the RFC Editor function is currently provided by the
556     Internet Society.