| < draft-tsou-vrom-problem-statement-01.txt | draft-tsou-vrom-problem-statement-02.txt > | |||
|---|---|---|---|---|
| Internet Engineering Task Force V. Grado | Internet Engineering Task Force V. Grado | |||
| Internet-Draft T. Tsou | Internet-Draft T. Tsou | |||
| Intended status: Informational Huawei Technologies (USA) | Intended status: Informational Huawei Technologies | |||
| Expires: December 14, 2011 N. So | Expires: December 15, 2011 N. So | |||
| Verizon Communications Inc. | Verizon Communications Inc. | |||
| June 12, 2011 | June 13, 2011 | |||
| Virtual Resource Operations and Management in the Data Center | Virtual Resource Operations & Management in the Data Center | |||
| draft-tsou-vrom-problem-statement-01 | draft-tsou-vrom-problem-statement-02 | |||
| Abstract | Abstract | |||
| The dynamic allocation of computing resources on a massive scale | The dynamic allocation of computing resources on a massive scale | |||
| through the use of virtual machines running over a "hypervisor" layer | through the use of virtual machines to serve a large number of | |||
| to serve a large number of customers and applications simultaneously | customers and applications simultaneously brings a number of benefits | |||
| brings a number of benefits but also a number of challenges to data | but also a number of challenges to data center operations. Such | |||
| center operations. Such challenges range from acquiring the | challenges range from acquiring the information needed to provision | |||
| information needed to provision the physical servers, storage and | the physical servers, storage and networking elements, through | |||
| networking elements, through accounting for resource and application | accounting for resource and application usage at the user level. In | |||
| usage at the user level. The Distributed Management Task Force | particular, this document describes the problem of operational and | |||
| (DMTF) has begun the work of developing the standards needed to | management challenges that virtualization brings in the (carrier) | |||
| support this work, but many tasks remain. This document provides a | data center as an enabler of new technologies such as self- | |||
| brief survey of the problem space, but focusses on the requirements | provisioning and elastic capacity and related benefits of | |||
| for operation and management of network resources within the data | consolidation, reduced total cost of ownership, and energy | |||
| center complex and between that complex and the users. | management. This document does not cover the problem of address | |||
| resolution in massive data centers. It does not cover technologies | ||||
| related to VDI either. | ||||
| Status of this Memo | Status of this Memo | |||
| This Internet-Draft is submitted in full conformance with the | This Internet-Draft is submitted in full conformance with the | |||
| provisions of BCP 78 and BCP 79. | provisions of BCP 78 and BCP 79. | |||
| Internet-Drafts are working documents of the Internet Engineering | Internet-Drafts are working documents of the Internet Engineering | |||
| Task Force (IETF). Note that other groups may also distribute | Task Force (IETF). Note that other groups may also distribute | |||
| working documents as Internet-Drafts. The list of current Internet- | working documents as Internet-Drafts. The list of current Internet- | |||
| Drafts is at http://datatracker.ietf.org/drafts/current/. | Drafts is at http://datatracker.ietf.org/drafts/current/. | |||
| Internet-Drafts are draft documents valid for a maximum of six months | Internet-Drafts are draft documents valid for a maximum of six months | |||
| and may be updated, replaced, or obsoleted by other documents at any | and may be updated, replaced, or obsoleted by other documents at any | |||
| time. It is inappropriate to use Internet-Drafts as reference | time. It is inappropriate to use Internet-Drafts as reference | |||
| material or to cite them other than as "work in progress." | material or to cite them other than as "work in progress." | |||
| This Internet-Draft will expire on December 14, 2011. | This Internet-Draft will expire on December 15, 2011. | |||
| Copyright Notice | Copyright Notice | |||
| Copyright (c) 2011 IETF Trust and the persons identified as the | Copyright (c) 2011 IETF Trust and the persons identified as the | |||
| document authors. All rights reserved. | document authors. All rights reserved. | |||
| This document is subject to BCP 78 and the IETF Trust's Legal | This document is subject to BCP 78 and the IETF Trust's Legal | |||
| Provisions Relating to IETF Documents | Provisions Relating to IETF Documents | |||
| (http://trustee.ietf.org/license-info) in effect on the date of | (http://trustee.ietf.org/license-info) in effect on the date of | |||
| publication of this document. Please review these documents | publication of this document. Please review these documents | |||
| carefully, as they describe your rights and restrictions with respect | carefully, as they describe your rights and restrictions with respect | |||
| to this document. Code Components extracted from this document must | to this document. Code Components extracted from this document must | |||
| include Simplified BSD License text as described in Section 4.e of | include Simplified BSD License text as described in Section 4.e of | |||
| the Trust Legal Provisions and are provided without warranty as | the Trust Legal Provisions and are provided without warranty as | |||
| described in the Simplified BSD License. | described in the Simplified BSD License. | |||
| Table of Contents | Table of Contents | |||
| 1. Introduction | 1. Introduction | |||
| 1.1. Requirements Language | 2. Terminology | |||
| 2. Operational Challenges for Virtualization | 3. Operational Challenges for Virtualization | |||
| 2.1. A More Detailed Look At the Hypervisor | 3.1. Unique Requirements from Virtualization | |||
| 2.2. Operations and Management in a Virtualized Data Center | 3.2. Operations and Management in a Virtualized DC | |||
| 3. Real and Virtual Network Management in the Virtualized | 4. VM Performance and Configuration Management Challenges | |||
| Data Center | 4.1. Performance Management Challenges in Virtualization | |||
| 4. Conclusions | 4.2. VM Configuration and Inventory Operational Challenges | |||
| 5. Acknowledgements | 5. Operational Challenges in Services with Virtual Resources | |||
| 6. IANA Considerations | 5.1. VM Migration Operational Challenges | |||
| 7. Security Considerations | 5.2. VPC Operational and Management Challenges | |||
| 8. Informative References | 6. Conclusion and Recommendation | |||
| Authors' Addresses | 7. Manageability Considerations | |||
| 8. Security Considerations | ||||
| 9. IANA Considerations | ||||
| 10. Acknowledgements | ||||
| Authors' Addresses | ||||
| 1. Introduction | 1. Introduction | |||
| There is currently a strong movement toward virtualization of data | There is currently a strong movement toward virtualization of data | |||
| center resources, with the aim of improving physical resource | center resources, with the aim of improving physical resource | |||
| utilization, reducing energy consumption as a result, and improving | utilization, reducing energy consumption as a result, and improving | |||
| responsiveness to demands for data center resources. Along with this | responsiveness to demands for data center resources. Along with this | |||
| is a parallel movement toward outsourcing data center operations, | is a parallel movement toward outsourcing data center operations, | |||
| with the result that multiple enterprises may share the same physical | with the result that multiple enterprises may share the same physical | |||
| resources for their own computing and storage requirements. Both in- | resources for their own computing and storage requirements. Both in- | |||
| house and outsourced data center virtualization raise obvious | house and outsourced data center virtualization raise obvious | |||
| concerns over data security and regulatory compliance, but this is | concerns over data security and regulatory compliance, but this is | |||
| just one aspect of the operational and management challenges raised | just one aspect of the operational and management challenges raised | |||
| by large-scale resource virtualization. | by large-scale resource virtualization. | |||
| The basic unit of resource virtualization in this architecture is the | The basic unit of resource virtualization in this architecture is | |||
| virtual machine (VM), running over a "hypervisor" layer and sharing a | the virtual machine (VM), running over a "hypervisor" layer and | |||
| physical server with other virtual machines and a management entity. | sharing a physical server with other virtual machines and a | |||
| The virtual machine has its own guest operating system, set of one or | management entity. The virtual machine has its own guest operating | |||
| more applications, and allocations of processing, storage, and | system, set of one or more applications, and allocations of | |||
| networking resources. The Distributed Management Task Force (DMTF) | processing, storage, and networking resources. The need to mix and | |||
| has provided a standard interface for management of the virtual | match products from different vendors can lead to interoperability | |||
| machine life cycle, the Open Virtualization Format [OVF]. | challenges that need to be addressed by standards from the start, or | |||
| risk vendor lock-in. | ||||
| Within the data center complex, virtual machines may migrate from one | This document focuses on the problem statement of various data center | |||
| set of physical resources to another. The data center complex may | virtual resources operations and management areas. This document | |||
| itself be distributed geographically, and resources for a single | does not cover the problem of address resolution in massive data | |||
| virtual machine may be spread over multiple locations. This raises | centers nor the problem of technologies known as VDI. | |||
| the importance of ensuring adequate and well-running network | ||||
| resources within the data center complex. | ||||
| The next section is a slightly more detailed description of the | 2. Terminology | |||
| interaction between the hypervisor and the virtual machines it | ||||
| supports, followed by a general enumeration of the complete range of | ||||
| operations and management issues associated with massive | ||||
| virtualization within the data center complex. The following section | ||||
| looks in more detail at the problem of operating and managing the | ||||
| virtual and physical networking resources within the complex, with | ||||
| the aim of laying the groundwork for identifying gaps in the existing | ||||
| set of standards in this area. The concluding section actually | ||||
| identifies those gaps. | ||||
| 1.1. Requirements Language | CE: Customer Edge | |||
| This document contains no normative language. | DC: Data Center | |||
| 2. Operational Challenges for Virtualization | DE: Data Center Edge | |||
| 2.1. A More Detailed Look At the Hypervisor | PE: Provider Edge | |||
| With virtualized resources, a virtual machine (VM) embodies virtual | VDC: Virtualized Data Center | |||
| hardware that is emulated by a hypervisor (or a similar mechanism | ||||
| with respect to virtual networking resources). The hypervisor | ||||
| mediates all interactions with the underlying physical hardware. | ||||
| That mechanism is transparent to the guest operating system, which | ||||
| runs completely independently of other VMs sharing the same physical | ||||
| resources. | ||||
| The hypervisor performs the mapping between the virtual resources of | VDI: Virtualized Desktop Infrastructure | |||
| the VM (usually an application and a guest operating system) and the | ||||
| physical hardware of a server, storage, or network. The hypervisor | VPC: Virtual Private Clouds (or VPN based Clouds) | |||
| is the component responsible for managing physical resources to | ||||
| allocate them fairly to the multiple VMs running on a host. | VM: Virtual Machine (or host) | |||
| SLA: Service Level Agreement | ||||
| 3. Operational Challenges for Virtualization | ||||
| 3.1. Unique Requirements from Virtualization | ||||
| There are operational challenges and requirements ensuing from | ||||
| virtualized resources that are unique and not present in conventional | ||||
| implementations. | ||||
| In virtualized resources, a virtual machine (VM) embodies virtual | ||||
| hardware that is emulated by a hypervisor (or a similar mechanism in | ||||
| virtual networking resources) that mediates all interactions with | ||||
| the underlying physical hardware. That mechanism is transparent to | ||||
| the guest operating system, which runs completely independently of | ||||
| other guest VMs on the same physical resources. | ||||
| The hypervisor then performs the mapping between the virtual | ||||
| resources of the VM (usually an application and a guest operating | ||||
| system) and the physical hardware of a server, storage, or network. | ||||
| The hypervisor is the component responsible for managing physical | ||||
| resources to allocate them fairly to the multiple VMs running on a | ||||
| host. | ||||
| The main physical resource pools that the hypervisor needs to manage | The main physical resource pools that the hypervisor needs to manage | |||
| to carry out its job are as follows: | for carrying out its job are as follows: | |||
| o CPU: A configurable amount of CPU assigned to a VM, during | o CPU: A configurable amount of CPU assigned to a VM, during | |||
| creation, regardless of the real amount of physical CPU. The | creation, regardless of the real amount of physical CPU. A CPU | |||
| hypervisor uses a CPU scheduler to process the CPU requests from | scheduler is used by the hypervisor to process the CPU requests | |||
| the VMs. | from the VMs. | |||
| o Disk: A single large file allocated on one of the host's datastores | o Disk: A single large file allocated on one of the host's datastores | |||
| as a virtual disk for each VM. Disk I/O requests are also queued | as a virtual disk for each VM. Disk I/O requests are also queued | |||
| for each VM. | for each VM. | |||
| o Memory: A fixed amount of memory that gets mapped into virtual | o Memory: A fixed amount of memory that gets mapped into virtual | |||
| memory pages and in turn to physical memory pages. The hypervisor | memory pages and in turn to physical memory pages. The hypervisor | |||
| must ensure there is no overallocation of virtual memory that the | must ensure there is no overallocation of virtual memory that the | |||
| physical memory cannot handle. | physical memory cannot handle. | |||
| o Network: The virtual machine includes a virtual network to provide | o Network: It includes a virtual network to provide the same | |||
| the same functionality as a physical network, including IP | functionality as a physical network, including IP address, virtual | |||
| address, virtual NIC, switches and firewalls. Some network | NIC, switches and firewalls. Because the network traffic is only | |||
| traffic passes only between VMs on the same host, and will not be | handled between VMs, many times there is no visibility to external | |||
| visible to external physical tools. | physical tools. | |||
| 2.2. Operations and Management in a Virtualized Data Center | 3.2. Operations and Management in a Virtualized DC | |||
| [PTT] We can add material to expand this, but bear in mind that it is | From the above, a number of operational challenges arise in a | |||
| just an introductory section and need not get too detailed. | virtualized environment that cover different levels of service in a | |||
| DC. Some of the challenges are: | ||||
| From the brief description given above, one can infer a number of | 1. New devices and elements | |||
| operational challenges that arise in a virtualized environment to | ||||
| cover different levels of service in a data center. Some of the | ||||
| challenges are: | ||||
| 1. Impact of new devices and elements: | * Monitor VM lifecycle, including VM migration ("lift & shift") | |||
| * monitoring of the VM life cycle, including VM migration ("lift | * Address management for VM lifecycle support | |||
| and shift"); | ||||
| * address management for VM life cycle support; | * Resource monitoring for faults and abnormal conditions | |||
| * resource monitoring for faults and abnormal conditions; | * Resource availability, performance metrics and usage-based | |||
| metering | ||||
| * metering of resource availability, performance metrics and | * Hypervisor status and interface monitoring | |||
| usage; | ||||
| * monitoring of the status of the hypervisor and the interface | 2. Infrastructure management support | |||
| to it. | ||||
| 2. Infrastructure management support: | * Connectivity needs for virtualization management | |||
| * connectivity needs for virtualization management; | * Policy management and enforcement in the Virtualized DC | |||
| * policy management and enforcement; | * IPFIX for virtualization performance management | |||
| * virtualization performance management; | * Interoperability of multiple hypervisors | |||
| * interoperability of multiple hypervisors; | * Open programmatic interfaces to support access and management | |||
| of Datacenter contents and resources | ||||
| * open programmatic interfaces to support access and management | 3. Service management enablement | |||
| of data center contents and resources | ||||
| 3. Enabling service management: | * Supporting secure low-latency VLAN and VPN connections in | |||
| large scale on an on-demand (pay as you go) basis for capacity | ||||
| management of dedicated pools of resources | ||||
| * supporting secure low-latency VLAN and VPN connections in | * Service hosting, co-location, and distributed virtualized | |||
| large scale on an on-demand (pay as you go) basis for capacity | redundancy for seamless scaling | |||
| management of dedicated pools of resources | ||||
| * scalable service hosting, collocation, and distributed | * Facilities management including premises, security, privacy | |||
| virtualized redundancy | and data integrity management for regulatory compliance | |||
| * facilities management including premises, security, privacy, | * Management of VPN-based clouds | |||
| and data integrity management for regulatory compliance; | ||||
| * management of virtual private data centers, VPN-based data | 4. VM Performance and Configuration Management Challenges | |||
| centers. | ||||
| 3. Real and Virtual Network Management in the Virtualized Data Center | 4.1. Performance Management Challenges in Virtualization | |||
| [PTT] Time is pressing, so I'll propose text for this later, unless | From the discussion in the previous sections, it is clear that | |||
| someone else can do it. Basically we have to monitor at three | performance management for virtualized resources is a very critical | |||
| levels: virtual network connections (rely on the hypervisor for | area. This can include capacity management, as well as availability | |||
| that), physical connections within the (possibly distributed) data | management, since they provide a status of the health of the network | |||
| center, and the customer connections into the data center. | resources and services, including the health of the VMs and | |||
| hypervisors. | ||||
| 4. Conclusions | While a hypervisor is in charge of load balancing and keeping track of | |||
| resource utilization, additional mechanisms need to be in place to | ||||
| obtain the best performance. There is a need to obtain key metrics | ||||
| from the VMs that can be used to support a more robust management of | ||||
| resources and services. | ||||
| [PTT] Juergen?? You'd know what tools exist now for the job and what | A protocol such as IPFIX that can tap into VMs to obtain flow metrics | |||
| needs development. | needs to be devised (or existing ones enhanced). The sources of the | |||
| metrics for virtualized resources are different from the sources in | ||||
| physical resources, as described above. | ||||
| 5. Acknowledgements | Metrics such as uptime that are provided by mechanisms within IPPM | |||
| and PMOL recommendations need to be obtained for virtualized hosts as | ||||
| well. | ||||
| Tom Taylor added text and may become an author unless it is necessary | 4.2. VM Configuration and Inventory Operational Challenges | |||
| to leave room for others. | ||||
| 6. IANA Considerations | Another critical challenge arising from the creation of virtual hosts | |||
| is 'sprawl', which can happen over time when there is a lack of control | ||||
| and monitoring in the lifecycle of a large number of VMs. Besides | ||||
| service and performance problems that might arise, configuration | ||||
| management issues will ensue from VMs that consume resources in | ||||
| the background. If unmonitored, some may fall out of sync with | ||||
| policy and compliance, application fine-tuning, bandwidth | ||||
| management, etc. | ||||
| This memo includes no request to IANA. | There is a need to carry out configuration management for virtual | |||
| resources and hosts that includes discovery, inventory and backup, for | ||||
| both the virtual and physical resources. There is a need for a | ||||
| protocol like NETCONF that also covers virtual hosts. | ||||
| 7. Security Considerations | 5. Operational Challenges in Services with Virtual Resources | |||
| 5.1. VM Migration Operational Challenges | ||||
| Security is a very important consideration, both for private and | VM migration, also called by other names such as VM Motion, "lift & | |||
| multi-user virtualized data centers. However, detailed discussion of | shift", etc., implies moving a VM to another location within a data | |||
| that topic is out of the scope of this document. This memo raises no | center, or even to a different data center, with the consequent | |||
| security issues in itself. | operational and management challenges. | |||
| 8. Informative References | Just to name a few of the challenges: | |||
| [OVF] Distributed Management Task Force (DMTF), "Open Virtualization | o Policy reconfiguration in the destination device | |||
| Format (OVF)", January 2010, <http://dmtf.org/standards/ovf>. | ||||
| o Other dynamic information updating in destination | ||||
| o Address management and reconfiguration when involving different | ||||
| data centers | ||||
| In addition, there are a few complex models for the interconnection | ||||
| of service providers supporting virtualized resources already working | ||||
| their way into real implementations that will allow more complex VM | ||||
| migration schemes but that also present their own set of | ||||
| operational and management challenges. | ||||
| 5.2. VPC Operational and Management Challenges | ||||
| Among the virtualized resource deployment models, the VPC model brings | ||||
| together most of the operational requirements into the unified | ||||
| computing stack, and in particular into the network side, directly. A VPC | ||||
| embodies services that are delivered over a virtual private network | ||||
| (VPN), and therefore the VPN protocols and implementations need to be | ||||
| enhanced to support the "characteristics" mentioned above in addition | ||||
| to their own operational requirements. At a higher level, a VPC | ||||
| needs to meet SLAs, enforce policies, and meet demands from the | ||||
| management of order requests, self-provisioning, usage-based | ||||
| metering, and management, either through programmatic interfaces or | ||||
| by other means. | ||||
| There are two cases: 1) the pure-play virtualization-enabled | ||||
| provider that needs to use another carrier to interconnect the | ||||
| different data centers, and 2) the carrier offering over its own | ||||
| network. In both cases, the VPN protocols need operational | ||||
| enhancements to support end-to-end SLA monitoring, or even just | ||||
| internal service level objectives in addition to customer SLAs, all of | ||||
| which require corresponding metrics based on usage per resource and | ||||
| per customer. | ||||
| In the second case, additional improvements can be made, such as | ||||
| deploying a DC Edge device (DE), a box between the DC and the | ||||
| network edge that simplifies provisioning and operations by | ||||
| eliminating the need for a CE-PE pair. | ||||
| 6. Conclusion and Recommendation | ||||
| With the new networking, server and storage technologies converging | ||||
| into the DC in the form of unified computing solutions at whose core | ||||
| is a virtualization stack, many new operational and management | ||||
| challenges arise. | ||||
| Therefore, we recommend that the IETF engage in the study of the | ||||
| problem of virtualized resources operations and management and, if | ||||
| appropriate, the development of interoperable solutions. | ||||
| 7. Manageability Considerations | ||||
| This document does not add additional manageability considerations. | ||||
| 8. Security Considerations | ||||
| To come. | ||||
| 9. IANA Considerations | ||||
| This memo currently includes no request to IANA. | ||||
| 10. Acknowledgements | ||||
| Awaiting comments. | ||||
| Authors' Addresses | Authors' Addresses | |||
| Victor M. Grado | Victor M. Grado | |||
| Huawei Technologies (USA) | Huawei Technologies | |||
| 2330 Central Expwy, | 2330 Central Expwy | |||
| Santa Clara,, CA 95050 | Santa Clara, CA 95050 | |||
| USA | US | |||
| Phone: | Phone: | |||
| Email: vgrado@huawei.com | Email: vgrado@huawei.com | |||
| Tina Tsou | Tina Tsou | |||
| Huawei Technologies (USA) | Huawei Technologies | |||
| 2330 Central Expwy, | 2330 Central Expwy | |||
| Santa Clara,, CA 95050 | Santa Clara, CA 95050 | |||
| USA | US | |||
| Phone: | Phone: | |||
| Email: tena@huawei.com | Email: tena@huawei.com | |||
| Ning So | Ning So | |||
| Verizon Communications Inc. | Verizon Communications Inc. | |||
| 2400 N. Glenville Ave, | 2400 N. Glenville Ave | |||
| Richardson,, TX 75080 | Richardson, TX 75080 | |||
| USA | US | |||
| Phone: | Phone: | |||
| Email: ning.so@verizonbusiness.com | Email: ning.so@verizonbusiness.com | |||
| End of changes. 55 change blocks. | ||||
| 149 lines changed or deleted | 239 lines changed or added | |||