TEAS Working Group                                                Y. Lee
Internet Draft                                                    Huawei
Intended status: Informational
Expires April 30, 2018                                   L. M. Contreras
                                                              Telefonica

                                                     Carlos J. Bernardos
                                                                    UC3M

                                                                   H. Xu
                                                           China Telecom

                                                        October 30, 2017


                Use Cases for Cross-Stratum Optimization

                    draft-lee-teas-cso-use-cases-00

Abstract

   This draft provides use cases and requirements for cross-stratum
   optimization.

Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with
   the provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other
   documents at any time. It is inappropriate to use Internet-Drafts
   as reference material or to cite them other than as "work in
   progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on April 30, 2018.

Copyright Notice

   Copyright (c) 2017 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with
   respect to this document.
   Code Components extracted from this document must include
   Simplified BSD License text as described in Section 4.e of the
   Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1 Introduction
      1.1 Scope and Objectives
      1.2 Common Terms, Abbreviations and Definitions
   2 Use Cases
      2.1 Game Server Application
      2.2 Automatic assignment of ICT resources to meet SLAs of App
          Orchestrator
         2.2.1 ICT Auto-Scaling Monitoring
         2.2.2 ICT Auto-Scaling Reservation
      2.3 Hybrid Cloud
      2.4 Virtual CDN
   3 Summary and Conclusions
   4 References
   5 Contributors
   Authors' Addresses

1 Introduction

1.1 Scope and Objectives

   Distributed computing environments allow end-users, from
   individuals to enterprises, to access large pools of storage
   resources, computational resources and various application
   services (e.g., video caching, Virtual Machine mobility, media
   content delivery, and IoT). Data centers provide the physical and
   virtual infrastructure in which applications and services are
   provided.

   Since the data centers used to provide application services are
   distributed geographically around a network (or a set of
   interconnected networks), application service instantiation can
   have a significant impact on the state of the network resources.
   Conversely, the capabilities and current state of the network can
   have a major impact on application performance.

   This draft addresses end-to-end orchestration, termed
   Cross-Stratum Optimization (CSO), across Application
   orchestration, Data Center SDN orchestration, and WAN SDN
   orchestration, so that applications can be created seamlessly and
   optimally for operators and their customers.

   This document provides a set of use cases for Application-Driven
   Cross-Stratum Orchestration, mainly provided by operators.

1.2 Common Terms, Abbreviations and Definitions

   CSO: Cross-Stratum Optimization or Cross-Stratum Optimizer,
   depending on the context. For historical reasons, it can also be
   expanded as Cross-Stratum Orchestration/Orchestrator.

   Application Stratum: The functional grouping that encompasses
   application resources and the control and management of these
   resources. These application resources are used along with network
   services to provide an application service to clients/end-users.
   Application resources are non-network resources critical to
   achieving the application service functionality. Examples of
   application resources include caches, mirrors, application-specific
   servers, content, large data sets, and computing power. An
   application service is a networked application offered to a variety
   of clients (e.g., server backup, VM migration, video cache, virtual
   network on-demand, and 5G network slicing). The entity responsible
   for application stratum control and management of its resources is
   referred to as the application orchestrator.

   Network Stratum: The functional grouping that encompasses network
   resources and the control and management of these resources,
   providing transport of data between clients/end-users and
   application sources.
   Network resources are resources at Layer 3 or below (L0/L1/L2/L3),
   such as bandwidth, links, paths, path processing (creation,
   deletion, and management), network databases, path computation,
   admission control, and resource reservation. In some cases, network
   resources may include L4 service functionality, such as firewalls
   and load balancing, as part of path computation constraints. There
   are different types of network stratum controllers/orchestrators.

   ICT: Information and Communication Technology.

   Orchestration: The ongoing selection and use of resources by a
   server to satisfy client demands according to optimization criteria
   (as defined in [SDN-Arch]).

   SDN Controller: The SDN controller is at the heart of the SDN
   architecture. It is the intelligent entity that controls resources
   to deliver services. Its core function is the real-time
   multi-dimensional convergence of a changing resource environment
   and a changing service demand environment toward an optimum, where
   the optimization criteria may also change in time (as defined in
   [SDN-Arch]).

2 Use Cases

   This section provides a set of CSO-related use cases and their
   requirements, mainly provided by operators.

2.1 Game Server Application

   The online gaming business is one of the fastest growing areas in
   the ICT market around the world, breaking down the barriers between
   nations and increasing the number of multi-national game users.
   Gaming traffic is generated by a huge number of player interactions
   during game sessions, and it can vary dynamically in both time and
   geographic scale. In addition, because real-time online gaming is a
   time-critical service, game users are very sensitive to ICT
   parameters such as server response time, delay, jitter, and
   synchronization time between users.
   Therefore, it is desirable that the game service provider has its
   servers located close to the game users, using distributed data
   centers for fast and reliable networking in order to guarantee
   quality of service. This is one driver that is causing provider ICT
   resources to move from data centers to locations in access
   networks, such as a Telco DSLAM or a cable headend.

   Three terms are used to describe this use case: ICT resource, ICT
   provider, and Game service provider. The ICT resource refers to
   storage, compute, and networking resources across the WAN. The ICT
   provider is a CSO operator that provides ICT resources for its
   clients, including game service providers. The Game service
   provider, for example an App owner, is a client of the CSO operator
   that consumes ICT resources from the ICT provider.

   A game service provider that uses the public cloud of an ICT
   provider to build out its game infrastructure will commonly lease
   ICT resources adaptively to save operational costs. However,
   current static or manual resource allocation cannot meet dynamic,
   time-varying demands with unexpectedly changing traffic and access
   patterns from users. As a result, most game service providers need
   a solution to dynamically configure their ICT resources according
   to the status of resource usage, such as the number of active
   users, server and storage load, and network performance. It is
   necessary that the ICT provider monitor the leased ICT resources
   for status, and report the information to the game service provider
   for on-demand resource control.

   If the game service provider wants to expand its existing ICT
   infrastructure across multiple nations for its global game
   business, it has two options. First, the game service provider
   could contract directly with each of the multiple ICT providers
   located in other countries to purchase ICT services.
   However, that requires time, effort and coordination for each ICT
   provider, not only to explore business relationships with new ICT
   providers, but also to handle multiple different APIs for access to
   ICT resources from multiple ICT providers. Second, the game service
   provider can ask for a dedicated ICT service from a primary ICT
   provider with which it already has an established business
   relationship for its existing ICT infrastructure. The second option
   enables the game service provider to receive full ICT services from
   a delegated ICT provider acting on behalf of the other ICT
   providers, reducing operational complexity by eliminating the
   multiple APIs of many ICT providers. Generally, the delegated
   operator can purchase network services for its customers at a
   wholesale price from other network operators, which is more
   economical than having individual small and medium game service
   providers purchase them directly. The delegated ICT provider offers
   customers the delegated network service at a reasonable price while
   remaining profitable. Therefore, the delegated ICT service benefits
   both the delegated operator and the game service provider. Given
   that current international leased line service is mostly provided
   manually by the delegated network operator, it is natural that the
   game service provider would also choose the second option for
   reasons of convenience and business economy.

   The main high-level requirements of this use case include:

   . Shared information between the delegated CSO and the
     sub-contracted CSO before the delegated ICT service is requested

     - Directory information of ICT resources, available ICT
       resources, policy (such as price), etc.

   . Federation of CSOs to reserve ICT resources including compute,
     storage, and network

     - Sequential control from the delegated CSO to the sub-contracted
       CSO to reserve ICT resources

     - Reservation parameters: user ID, computing power, amount of
       storage, network bandwidth, and customized fault and
       performance parameters to be reported to each App Owner, etc.

   . Customized status report of assigned ICT resources to each App
     Orchestrator

     - Fault parameters: failures of server, network (link & node),
       and storage

     - Performance parameters: server load, storage load, network
       bandwidth load, etc.

   . Dynamic control of ICT resources resulting from the reported
     status data

     - Recovery (restoration and protection) of failures according to
       SLAs

2.2 Automatic assignment of ICT resources to meet SLAs of App
    Orchestrator

2.2.1 ICT Auto-Scaling Monitoring

   Some network services, like gaming and CDN, have rapidly
   time-varying traffic patterns, making it difficult to estimate
   traffic levels in order to reserve ICT resources. Therefore, an
   Application Orchestrator that leases ICT resources from CSOs can
   easily oversubscribe ICT resources in order to meet service goals
   such as QoS. If the Application Orchestrator instead leases its ICT
   resources from the CSO adaptively, to optimize resource usage for
   cost savings, it incurs significant overhead to monitor the
   real-time status of its ICT resources, which raises OPEX.
   Therefore, some ICT customers may want to avoid the cost of
   managing these ICT resources, and will use the CSO to perform this
   task on their behalf. This use case requires an ICT auto-scaling
   (i.e., self-organizing) function that automatically scales ICT
   resources in and out according to the SLA.
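
   As an illustration only, the scale-in/scale-out decision implied by
   such a function can be sketched as a simple threshold rule. The
   Python sketch below is not part of any CSO interface: the
   ScalingPolicy fields, the load thresholds, the server limits, and
   the one-server step size are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class ScalingPolicy:
    """Hypothetical auto-scaling policy agreed with the CSO (assumed fields)."""
    scale_out_load: float  # average server load above which a server is added
    scale_in_load: float   # average server load below which a server is removed
    min_servers: int       # lower bound on leased servers
    max_servers: int       # upper bound on leased servers


def desired_servers(policy: ScalingPolicy, servers: int, avg_load: float) -> int:
    """Return the server count to move to after one monitoring sample."""
    if avg_load > policy.scale_out_load:
        # Scale out by one server, never exceeding the agreed maximum.
        return min(servers + 1, policy.max_servers)
    if avg_load < policy.scale_in_load:
        # Scale in by one server, never dropping below the agreed minimum.
        return max(servers - 1, policy.min_servers)
    # Load is within the agreed band: keep the current allocation.
    return servers


# Example values are assumptions, not figures from this document.
policy = ScalingPolicy(scale_out_load=0.8, scale_in_load=0.3,
                       min_servers=2, max_servers=10)
servers = 2
for load in [0.5, 0.9, 0.95, 0.6, 0.2]:
    servers = desired_servers(policy, servers, load)
```

   In this sketch the CSO, not the Application Orchestrator, evaluates
   each monitoring sample, so no explicit capacity request is needed
   once the policy has been agreed.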

   The ICT auto-scaling function should combine several sub-functions:
   monitoring network resource usage in real time, analyzing optimal
   resource levels, and then adjusting the levels of resource usage.
   For example, the Application Orchestrator can request the ICT
   auto-scaling service from the CSO and specify a server load
   threshold above which the number of servers is to be increased.
   From then on, even though the CSO receives no explicit request from
   the Application Orchestrator to increase the capacity of the server
   resources, it automatically increases the number of servers to
   lower the load on the servers whenever the server load reaches the
   threshold.

   This use case is also applicable to the delegated service described
   in the previous section. On receiving a request for the ICT
   auto-scaling service from the Application Orchestrator, the
   delegated CSO may request the service from a subcontracted CSO in
   order to provide a complete auto-scaling service over all leased
   ICT resources. After receiving the request, the subcontracted CSO
   automatically controls the ICT resources according to the status of
   the resources assigned.

   The main high-level requirements of this use case include:

   . Auto-scaling policy negotiated between the Application
     Orchestrator and the delegated CSO, or between the delegated CSO
     and the subcontracted CSO

   . Create, read, update and delete the dedicated resources as needed
     (including network, compute and storage) requested by the
     Application Orchestrator

   . Analytics function to determine when the auto-scaling policy
     should be deployed

   . Resource usage report including current usage and billing change
     information

   . Performance and fault management of the assigned resources

2.2.2 ICT Auto-Scaling Reservation

   A further example is offering a high-quality forwarding service
   through the CSO.
   An Application Orchestrator can submit a forwarding guarantee
   request to a CSO if it finds problems on the forwarding path, such
   as packet loss or unacceptable delay.

   The request includes the start and end points of the path, the
   bandwidth demand, and the delay requirements. In practice there may
   be dozens of start and end points, for example when service is
   offered from several data centers to dozens of nodes on the WAN.
   The request is then sent to a WAN Controller and processed by a
   Path Computation Element (PCE).

   The main high-level requirements of this use case include:

   . APIs between the CSO and WAN controllers for MPLS-TE functions,
     such as LSP design and traffic steering

   . WAN controllers should support OpenFlow, BGP FlowSpec, and the
     Path Computation Element (PCE)

2.3 Hybrid Cloud

   Hybrid cloud combines the use of public and private clouds and is
   the main development direction of cloud computing services today.
   Because of security and privacy considerations, enterprise
   customers generally prefer to keep critical data and core business
   transactions in their private cloud facilities when adding public
   cloud computing resources. They use the public cloud to run other,
   non-core business and non-critical transactions in an on-demand
   resource delivery mode, to reduce the overall cost of resources and
   add the flexibility they need.

   Hybrid cloud is not just a simple addition of private and public
   clouds; it always has the following three features:

   (1) Unified resource view: A hybrid cloud needs to have a unified
       service portal that includes a unified monitoring interface for
       all resources being used. This is a unified display of the
       customer resources located in both the public cloud and the
       private cloud.
   (2) Unified management of resources: A hybrid cloud should handle
       the life-cycle management of all resources deployed in both the
       private and public clouds through the unified portal described
       above. All customer resources are presented, searched and
       monitored at the portal, which also unifies the resource
       application and billing processes.

   (3) Unified inter-cloud networking: In hybrid cloud scenarios,
       customers always lease Virtual Private Cloud (VPC) resources in
       a public cloud, and then connect those resources to their own
       private cloud. This requires an interconnection between the
       private cloud and the VPC in the public cloud, and an
       appropriate networking solution has to be chosen.

   With the support and cooperation of the cloud management platform,
   the unified view and management of the hybrid cloud can be
   implemented by calling an open API from the management platform of
   the hybrid cloud to the public cloud. Currently, unified networking
   has become the key requirement of a hybrid cloud. For different
   business demands, there are different inter-cloud networking
   solutions. For example, if hybrid cloud customers want to use the
   public cloud for data backup, they need only a Layer 3 connection
   between the private and public clouds; on the other hand, if they
   want virtual machine resource expansion or virtual machine
   migration, they need a Layer 2 connection. Hybrid cloud networking
   based on Layer 2 connections is more challenging today.

   Traditional hybrid cloud networking solutions use VPN and leased
   line technologies to establish a network connection between private
   and public clouds. For example, the Direct Connect service launched
   by Amazon Web Services (AWS) is a networking solution between the
   customer private cloud and the AWS public cloud.
   Essentially, Direct Connect is a leased line service that can
   support hourly billing, so it requires the operator partners of
   Amazon to provide the network connections. Comparing the two
   techniques, VPN is more mature, easier to configure, and cheaper
   than a leased line service. However, VPN has some disadvantages: it
   has relatively lower performance and availability than a leased
   line service, and its implementation depends on the underlying
   physical network, which makes it difficult to guarantee QoS in a
   consistent manner. Compared to a VPN, a leased line has high
   performance and availability, but it requires customers to pay a
   higher cost, and its configuration is not flexible. Therefore,
   neither VPN nor leased line can fully meet the networking
   requirements of hybrid cloud scenarios.

   For hybrid cloud services provided by operators, introducing SDN to
   build and manage the connection between private cloud and public
   cloud is valuable. It enables coordinated, business-driven
   scheduling among the private cloud, the public cloud and the
   inter-cloud network, and solves the problems faced by the
   traditional networking technologies.

   The hybrid cloud resource orchestrator schedules cloud and network
   resources by calling the management platform APIs of the private
   and public clouds and the northbound interface of the inter-cloud
   networking controller. Through these interfaces, the status of all
   cloud and network resources can be displayed and managed. When the
   hybrid cloud business needs a network connection between the
   private cloud and the public cloud, the request is sent to the
   orchestrator, and the orchestrator then drives the controller to
   establish an on-demand inter-cloud connection between the two.

   Currently, some operators, such as China Telecom, are actively
   developing hybrid cloud services based on SDN.
   The core idea is to develop a hybrid cloud resource orchestrator
   based on the OpenStack cloud management platform, to achieve
   unified management and display of customer private cloud and
   OpenStack public cloud resources. To simplify the implementation,
   the orchestrator is developed on top of the OpenStack-based private
   cloud management platform. Meanwhile, the orchestrator will adapt
   to multi-vendor SDN network solutions, to build network connections
   between the private and public clouds on demand.

   SDN-based hybrid cloud solutions focus on the following:

   (1) Hybrid cloud resource orchestrator

       A hybrid cloud resource orchestrator can be developed based on
       OpenStack, and it can support all of the hybrid cloud resources
       being displayed, managed and connected on demand. OpenStack is
       a mainstream open source cloud management platform technology,
       with comprehensive cloud resource management capabilities. The
       hybrid cloud resource orchestrator drives cloud and network
       resources by calling RESTful APIs of OpenStack and the SDN
       controller, but some problems need to be solved, namely:

       . Unified authentication: The hybrid cloud resource
         orchestrator can simultaneously display and manage private
         and public cloud resources, which needs a Security Assertion
         Markup Language (SAML) 2.0-based authentication mechanism. In
         a cloud federation environment, this mechanism first enables
         trust between private cloud and public cloud interfaces, and
         then supports inter-cloud resource access.

       . Network mapping: The orchestrator needs to obtain the network
         segment identifiers of the private cloud, the public cloud
         and the inter-cloud network connections, and then map those
         network segment IDs to build an end-to-end connection, which
         establishes a network resource information library.

   (2) Inter-cloud SDN solution

       In a hybrid cloud network, the gateway and WAN devices are
       controlled by the corresponding SDN controller(s). Compared to
       other network scenarios, an operator's network is complex: the
       multiple technologies, vendors, models and other aspects
       involved present challenges in building efficient network
       connections between clouds. To deal with this situation, hybrid
       cloud networking requires adapting to various SDN network
       solutions, such as area-specific and vendor-specific
       controllers.

       The hybrid cloud resource orchestrator drives SDN solutions by
       calling RESTful APIs of the SDN controller(s). In order to
       achieve interoperability between the cloud's internal network
       and the inter-cloud SDN network, a RESTful API should minimally
       include the following information: NETWORK_TYPE,
       PHYSICAL_NETWORK and SEGMENTATION_ID. This information will be
       provided to the orchestrator by the inter-cloud SDN network
       controller(s).

   (3) IDC SDN Controller

       In order to achieve communication between the Intra Data Center
       (IDC) internal network segment (including private and public
       clouds) and the external inter-cloud SDN network segment, the
       network controller within the IDC should provide the necessary
       network information to the orchestrator. Therefore, like the
       inter-cloud SDN controller, it needs to provide a RESTful
       northbound API.

       In addition, the IDC SDN Controller is responsible for the
       configuration and deployment of the cloud network in the VPC
       resource pool of the public cloud, and should provide an NBI to
       the hybrid cloud resource orchestrator.

   (4) Heterogeneous resource management

       Currently, VMware resources are widely used in the enterprise
       private cloud.
       To implement heterogeneous resource management, the
       OpenStack-based hybrid cloud orchestrator should resolve the
       problem of how OpenStack can manage VMware resources by calling
       VMware open APIs. This relies on the joint efforts of VMware
       and the OpenStack community.

   The research and practice on SDN-based hybrid cloud done by China
   Telecom show that the system architecture is very similar to that
   of the CSO project, which supports the rationality and feasibility
   of the CSO architecture.

   The main high-level requirements of this use case include:

   . APIs between the CSO controller/hybrid cloud resource
     orchestrator and the management platforms of the private and
     public clouds, for unified management and display of hybrid cloud
     resources.

   . Southbound interface model of the CSO controller/hybrid cloud
     resource orchestrator for adapting to the various SDN solutions
     used to connect private cloud and public cloud on demand.

   . Workflow design of the CSO controller/hybrid cloud resource
     orchestrator for hybrid cloud management and operation, which
     includes user authentication, resource allocation, connection
     establishment, and application deployment.

   . End-to-end system architecture design to meet carrier-grade
     service requirements, such as high performance, high
     availability, high scalability and high interoperability.

2.4 Virtual CDN

   CDN providers may wish to deploy virtualized CDN (vCDN) end points
   inside the facilities of network providers in order to improve the
   experience perceived by end customers when accessing cached
   content.

   The CDN application will interact with the Network Provider
   Orchestrator (acting as the CSO Orchestrator in this case) in order
   to get access to both DC and network resources and/or capabilities
   for the mentioned functionalities.
   The CDN application will require access to the virtual CDN end
   point in order to handle the virtual cache. Such access could be
   indirect (via the Network Provider Orchestrator itself) or direct
   (having access to the DC Controller). In either case it gives
   access to management interfaces that permit remote management and
   control of the virtual cache functionality, for the end-to-end
   consistency of the CDN service.

   This situation necessarily requires agreement between the parties,
   which introduces the following main high-level requirements:

   . Deployment of specific virtualized capabilities for traffic
     distribution in the form of specialized network functions, i.e.,
     virtual caches, to be deployed in DCs of the network providers

   . Configuration of the circuits needed for feeding the virtual
     caches and connecting them to the origin server (in the CDN
     provider's network)

   . Configuration of QoS, SLAs, etc., as guaranteed capabilities to
     ensure proper service offering

   . Mechanisms for scaling in and/or out according to changing demand
     dynamics

   . Virtual cache relocation for the same reasons, or even for
     bringing some content closer to the final user

3 Summary and Conclusions

   In this document, we have discussed a set of use cases to which the
   CSO concept applies well. A set of requirements is identified for
   each use case. From an implementation standpoint, these
   requirements will be the basis for data modeling and protocol
   design for the CSO interfaces identified in this document.

4 References

   [SDN-Arch] SDN Architecture, Issue 1.1, 2016, ONF TR-521.

5 Contributors

Authors' Addresses

   Young Lee
   Huawei Technologies

   Email: leeyoung@huawei.com

   L. M. Contreras
   Telefonica

   Email: luismiguel.contrerasmurillo@telefonica.com

   Carlos J. Bernardos
   UC3M

   Email: cjbc@it.uc3m.es

   Honglei Xu
   China Telecom

   Email: xuhl.bri@chinatelecom.cn