Internet Engineering Task Force                                 V. Grado
Internet-Draft                                                   T. Tsou
Intended status: Informational                 Huawei Technologies (USA)
Expires: December 15, 2011                                         N. So
                                             Verizon Communications Inc.
                                                           June 13, 2011


      Virtual Resource Operations & Management in the Data Center
                 draft-tsou-vrom-problem-statement-02

Abstract

   The dynamic allocation of computing resources on a massive scale
   through the use of virtual machines running over a "hypervisor"
   layer to serve a large number of customers and applications
   simultaneously brings a number of benefits but also a number of
   challenges to data center operations.  Such challenges range from
   acquiring the information needed to provision the physical servers,
   storage and networking elements, through accounting for resource and
   application usage at the user level.  In particular, this document
   describes the operational and management challenges that
   virtualization brings in the (carrier) data center as an enabler of
   new technologies such as self-provisioning and elastic capacity, and
   of the related benefits of consolidation, reduced total cost of
   ownership, and energy management.  This document does not cover the
   problem of address resolution in massive data centers.  It does not
   cover technologies related to VDI either.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.
   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on December 15, 2011.

Copyright Notice

   Copyright (c) 2011 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Terminology
   3.  Operational Challenges for Virtualization
     3.1.  Unique Requirements from Virtualization
     3.2.  Operations and Management in a Virtualized DC
   4.  VM Performance and Configuration Management Challenges
     4.1.  Performance Management Challenges in Virtualization
     4.2.  VM Configuration and Inventory Operational Challenges
   5.  Operational Challenges in Services with Virtual Resources
     5.1.  VM Migration Operational Challenges
     5.2.  VPC Operational and Management Challenges
   6.  Conclusion and Recommendation
   7.  Manageability Considerations
   8.  Security Considerations
   9.  IANA Considerations
   10. Acknowledgements
   Authors' Addresses

1.  Introduction

   There is currently a strong movement toward virtualization of data
   center resources, with the aim of improving physical resource
   utilization, reducing energy consumption as a result, and improving
   responsiveness to demands for data center resources.  Along with this
   is a parallel movement toward outsourcing data center operations,
   with the result that multiple enterprises may share the same physical
   resources for their own computing and storage requirements.  Both
   in-house and outsourced data center virtualization raise obvious
   concerns over data security and regulatory compliance, but this is
   just one aspect of the operational and management challenges raised
   by large-scale resource virtualization.

   The basic unit of resource virtualization in this architecture is the
   virtual machine (VM), running over a "hypervisor" layer and sharing a
   physical server with other virtual machines and a management entity.
   The virtual machine has its own guest operating system, set of one or
   more applications, and allocations of processing, storage, and
   networking resources.

   The need to mix and match products from different vendors can lead to
   interoperability challenges that need to be addressed by standards
   from the start, or risk vendor lock-in.

   This document focuses on the problem statement for the various areas
   of operations and management of virtual resources in the data center.
   This document does not cover the problem of address resolution in
   massive data centers, nor the technologies known as VDI.

2.  Terminology

   CE:   Customer Edge

   DC:   Data Center

   DE:   Data Center Edge

   PE:   Provider Edge

   VDC:  Virtualized Data Center

   VDI:  Virtualized Desktop Infrastructure

   VPC:  Virtual Private Clouds (or VPN based Clouds)

   VM:   Virtual Machine (or host)

   SLA:  Service Level Agreement

3.  Operational Challenges for Virtualization

3.1.  Unique Requirements from Virtualization

   Virtualized resources give rise to operational challenges and
   requirements that are unique and not present in conventional
   implementations.

   In a virtualized environment, a virtual machine (VM) embodies virtual
   hardware that is emulated by a hypervisor (or by a similar mechanism
   in the case of virtual networking resources), which mediates all
   interactions with the underlying physical hardware.  That mechanism
   is transparent to the guest operating system, which runs completely
   independently of the other guest VMs sharing the same physical
   resources.  The hypervisor then performs the mapping between the
   virtual resources of the VM (usually an application and a guest
   operating system) and the physical hardware of a server, storage, or
   network.

   The hypervisor is the component responsible for managing physical
   resources so that they are allocated fairly to the multiple VMs
   running on a host.  The main physical resource pools that the
   hypervisor needs to manage to carry out its job are as follows:

   o  CPU: A configurable amount of CPU assigned to a VM at creation
      time, regardless of the real amount of physical CPU.  The
      hypervisor uses a CPU scheduler to process the CPU requests from
      the VMs.

   o  Disk: A single large file allocated on one of the host's
      datastores as a virtual disk for each VM.  Disk I/O requests are
      also queued for each VM.

   o  Memory: A fixed amount of memory that gets mapped into virtual
      memory pages and in turn to physical memory pages.  The hypervisor
      must ensure there is no overallocation of virtual memory that the
      physical memory cannot handle.

   o  Network: A virtual network that provides the same functionality as
      a physical network, including IP addresses, virtual NICs, switches
      and firewalls.  Because some network traffic is handled only
      between VMs on the same host, it is often not visible to external
      physical tools.

3.2.  Operations and Management in a Virtualized DC

   From the above, a number of operational challenges arise in a
   virtualized environment, covering different levels of service in a
   DC.  Some of these challenges are:

   1.  New devices and elements

       *  Monitoring of the VM lifecycle, including VM migration
          ("lift & shift")

       *  Address management for VM lifecycle support

       *  Resource monitoring for faults and abnormal conditions

       *  Resource availability, performance metrics and usage-based
          metering

       *  Hypervisor status and interface monitoring

   2.  Infrastructure management support

       *  Connectivity needs for virtualization management

       *  Policy management and enforcement in the virtualized DC

       *  IPFIX for virtualization performance management

       *  Interoperability of multiple hypervisors

       *  Open programmatic interfaces to support access and management
          of data center contents and resources

   3.  Service management enablement

       *  Supporting secure low-latency VLAN and VPN connections at
          large scale on an on-demand (pay-as-you-go) basis for capacity
          management of dedicated pools of resources

       *  Service hosting, co-location, and distributed virtualized
          redundancy for seamless scaling

       *  Facilities management including premises, security, privacy
          and data integrity management for regulatory compliance

       *  Management of VPN-based clouds

4.  VM Performance and Configuration Management Challenges

4.1.  Performance Management Challenges in Virtualization

   From the discussion in the previous sections, it is clear that
   performance management for virtualized resources is a critical area.
   This can include capacity management as well as availability
   management, since both provide a view of the health of the network
   resources and services, including the health of the VMs and
   hypervisors.

   While a hypervisor is in charge of load balancing and keeping track
   of resource utilization, additional mechanisms need to be in place to
   obtain the best performance.  There is a need to obtain key metrics
   from the VMs that can be used to support more robust management of
   resources and services.  A protocol such as IPFIX that can tap into
   VMs to obtain flow metrics needs to be devised (or existing ones
   enhanced).

   The sources of the metrics for virtualized resources are different
   from the sources for physical resources, as described above.  Metrics
   such as uptime, which are provided by mechanisms within IPPM and PMOL
   recommendations, need to be obtained for virtualized hosts as well.

4.2.  VM Configuration and Inventory Operational Challenges

   Another critical challenge arising from the creation of virtual hosts
   is "sprawl", which can develop over time when the lifecycle of a
   large number of VMs is not controlled and monitored.
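
   As a rough illustration of the kind of lifecycle monitoring that
   sprawl calls for, the following Python sketch flags VMs in an
   inventory that have no recorded owner or that have been idle beyond a
   threshold.  The record fields ("name", "owner", "last_activity"), the
   function name, and the 30-day default are hypothetical choices made
   for this sketch only; this document does not define an inventory
   format or a detection rule.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class VmRecord:
    """Hypothetical inventory entry; the field names are assumptions."""
    name: str
    owner: Optional[str]      # None when no owner is recorded
    last_activity: datetime   # last time the VM was observed doing work


def find_sprawl_candidates(
    inventory: List[VmRecord],
    now: datetime,
    max_idle: timedelta = timedelta(days=30),  # assumed threshold
) -> List[str]:
    """Return the names of VMs that are ownerless or idle beyond max_idle."""
    return [
        vm.name
        for vm in inventory
        if vm.owner is None or (now - vm.last_activity) > max_idle
    ]


if __name__ == "__main__":
    now = datetime(2011, 6, 13)
    inventory = [
        VmRecord("web-1", "team-a", now - timedelta(days=2)),
        VmRecord("test-7", None, now - timedelta(days=1)),
        VmRecord("batch-3", "team-b", now - timedelta(days=90)),
    ]
    print(find_sprawl_candidates(inventory, now))
```

   A real deployment would draw these records from whatever discovery
   and inventory mechanism is in use; the sketch only shows that sprawl
   detection reduces to a periodic check over lifecycle data once that
   data is available.
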
   Besides the service and performance problems that sprawl can cause,
   configuration management issues ensue from VMs that consume resources
   in the background.  If unmonitored, some may fall out of sync with
   policy and compliance requirements, with application tuning, with
   bandwidth management, and so on.  Configuration management for
   virtual resources and hosts needs to include discovery, inventory and
   backup, covering both the virtual and the physical resources.  There
   is a need for a protocol like NETCONF that also covers virtual hosts.

5.  Operational Challenges in Services with Virtual Resources

5.1.  VM Migration Operational Challenges

   VM migration, also known by other names such as VM Motion or "lift &
   shift", means moving a VM to another location within a data center,
   or even to a different data center, with the consequent operational
   and management challenges.  To name a few of these challenges:

   o  Policy reconfiguration in the destination device

   o  Updating of other dynamic information in the destination

   o  Address management and reconfiguration when different data centers
      are involved

   In addition, a few complex models for the interconnection of service
   providers supporting virtualized resources are already working their
   way into real implementations.  These will allow more complex VM
   migration schemes, but they also bring their own set of operational
   and management challenges.

5.2.  VPC Operational and Management Challenges

   Of the deployment models for virtualized resources, the VPC model is
   the one that most directly brings the operational requirements
   together in the unified computing stack, in particular on the network
   side.
   A VPC embodies services that are delivered over a virtual private
   network (VPN).  The VPN protocols and implementations therefore need
   to be enhanced to support the characteristics mentioned above, in
   addition to their own operational requirements.  At a higher level, a
   VPC needs to meet SLAs and enforce policies, and to meet the demands
   of order request management, self-provisioning, usage-based metering,
   and management, either through programmatic interfaces or by other
   means.

   There are two cases: 1) the pure-play virtualization-enabled
   provider, which needs to use another carrier to interconnect its
   different data centers, and 2) the carrier offering the service over
   its own network.  In both cases, the VPN protocols need operational
   enhancements to support end-to-end SLA monitoring, or even just
   internal service level objectives in addition to the customer SLA,
   all of which requires corresponding metrics based on usage per
   resource and per customer.  In the second case, additional
   improvements can be made, for example by deploying a DC Edge device
   (DE), a box between the DC and the network edge that simplifies
   provisioning and operations by eliminating the need for a CE-PE pair.

6.  Conclusion and Recommendation

   With new networking, server and storage technologies converging into
   the DC in the form of unified computing solutions that have a
   virtualization stack at their core, many new operational and
   management challenges arise.  Therefore, we recommend that the IETF
   engage in the study of the problem of virtualized resources
   operations and management and, if appropriate, the development of
   interoperable solutions.

7.  Manageability Considerations

   This document does not add additional manageability considerations.

8.  Security Considerations

   To come.

9.  IANA Considerations

   This memo currently includes no request to IANA.

10.  Acknowledgements

   Awaiting comments.

Authors' Addresses

   Victor M. Grado
   Huawei Technologies (USA)
   2330 Central Expwy
   Santa Clara, CA  95050
   US

   Phone:
   Email: vgrado@huawei.com


   Tina Tsou
   Huawei Technologies (USA)
   2330 Central Expwy
   Santa Clara, CA  95050
   US

   Phone:
   Email: tena@huawei.com


   Ning So
   Verizon Communications Inc.
   2400 N. Glenville Ave
   Richardson, TX  75080
   US

   Phone:
   Email: ning.so@verizonbusiness.com