| < draft-ietf-nvo3-hpvr2nve-cp-req-00.txt | draft-ietf-nvo3-hpvr2nve-cp-req-01.txt > | |||
|---|---|---|---|---|
| NVO3 Working Group Yizhou Li | NVO3 Working Group Yizhou Li | |||
| INTERNET-DRAFT Lucy Yong | INTERNET-DRAFT Lucy Yong | |||
| Intended Status: Informational Huawei Technologies | Intended Status: Informational Huawei Technologies | |||
| Lawrence Kreeger | Lawrence Kreeger | |||
| Cisco | Cisco | |||
| Thomas Narten | Thomas Narten | |||
| IBM | IBM | |||
| David Black | David Black | |||
| EMC | EMC | |||
| Expires: January 2, 2015 July 1, 2014 | Expires: May 22, 2015 November 18, 2014 | |||
| Hypervisor to NVE Control Plane Requirements | Hypervisor to NVE Control Plane Requirements | |||
| draft-ietf-nvo3-hpvr2nve-cp-req-00 | draft-ietf-nvo3-hpvr2nve-cp-req-01 | |||
| Abstract | Abstract | |||
| This document describes the control plane protocol requirements when | In a Split-NVE architecture, the functions of the NVE are split | |||
| NVE is not co-located with the hypervisor on a server. A control | across the hypervisor/container on a server and an external network | |||
| plane protocol (or protocols) between a hypervisor and its associated | equipment which is called an external NVE. A control plane | |||
| external NVE(s) is used for the hypervisor to populate its virtual | protocol(s) between a hypervisor and its associated external NVE(s) | |||
| machines states to the NVE(s) for further handling. This document | is used for the hypervisor to distribute its virtual machine | |||
| illustrates the functionalities required by such control plane | networking state to the external NVE(s) for further handling. This | |||
| signaling protocols and outlines the high level requirements to be | document illustrates the functionality required by this type of | |||
| fulfiled. Virtual machine states and state transitioning are | control plane signaling protocol and outlines the high level | |||
| summarized to help clarifying the needed requirements. | requirements. Virtual machine states as well as state transitioning | |||
| are summarized to help clarify the needed protocol requirements. | ||||
| Status of this Memo | Status of this Memo | |||
| This Internet-Draft is submitted to IETF in full conformance with the | This Internet-Draft is submitted to IETF in full conformance with the | |||
| provisions of BCP 78 and BCP 79. | provisions of BCP 78 and BCP 79. | |||
| Internet-Drafts are working documents of the Internet Engineering | Internet-Drafts are working documents of the Internet Engineering | |||
| Task Force (IETF), its areas, and its working groups. Note that | Task Force (IETF), its areas, and its working groups. Note that | |||
| other groups may also distribute working documents as | other groups may also distribute working documents as | |||
| Internet-Drafts. | Internet-Drafts. | |||
| skipping to change at page 2, line 25 ¶ | skipping to change at page 2, line 25 ¶ | |||
| publication of this document. Please review these documents | publication of this document. Please review these documents | |||
| carefully, as they describe your rights and restrictions with respect | carefully, as they describe your rights and restrictions with respect | |||
| to this document. Code Components extracted from this document must | to this document. Code Components extracted from this document must | |||
| include Simplified BSD License text as described in Section 4.e of | include Simplified BSD License text as described in Section 4.e of | |||
| the Trust Legal Provisions and are provided without warranty as | the Trust Legal Provisions and are provided without warranty as | |||
| described in the Simplified BSD License. | described in the Simplified BSD License. | |||
| Table of Contents | Table of Contents | |||
| 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 3 | 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 3 | |||
| 1.1 Terminology . . . . . . . . . . . . . . . . . . . . . . . . 3 | 1.1 Terminology . . . . . . . . . . . . . . . . . . . . . . . . 4 | |||
| 1.2 Target Scenarios . . . . . . . . . . . . . . . . . . . . . 4 | 1.2 Target Scenarios . . . . . . . . . . . . . . . . . . . . . 5 | |||
| 2. VM Lifecycle . . . . . . . . . . . . . . . . . . . . . . . . . 6 | 2. VM Lifecycle . . . . . . . . . . . . . . . . . . . . . . . . . 7 | |||
| 2.1 VM Creation . . . . . . . . . . . . . . . . . . . . . . . . 6 | 2.1 VM Creation Event . . . . . . . . . . . . . . . . . . . . . 7 | |||
| 2.2 VM Live Migration . . . . . . . . . . . . . . . . . . . . . 7 | 2.2 VM Live Migration Event . . . . . . . . . . . . . . . . . . 8 | |||
| 2.3 VM termination . . . . . . . . . . . . . . . . . . . . . . . 7 | 2.3 VM Termination Event . . . . . . . . . . . . . . . . . . . . 9 | |||
| 2.4 VM Pause, suspension and resumption . . . . . . . . . . . . 8 | 2.4 VM Pause, Suspension and Resumption Events . . . . . . . . . 9 | |||
| 3. Hypervisor-to-NVE Signaling protocol functionality . . . . . . 8 | 3. Hypervisor-to-NVE Control Plane Protocol Functionality . . . . 9 | |||
| 3.1 VN connect and disconnect . . . . . . . . . . . . . . . . . 8 | 3.1 VN connect and Disconnect . . . . . . . . . . . . . . . . . 10 | |||
| 3.2 TSI associate and activate . . . . . . . . . . . . . . . . . 10 | 3.2 TSI Associate and Activate . . . . . . . . . . . . . . . . . 11 | |||
| 3.3 TSI disassociate, deactivate and clear . . . . . . . . . . . 13 | 3.3 TSI Disassociate and Deactivate . . . . . . . . . . . . . . 14 | |||
| 4. Hypervisor-to-NVE Signaling Protocol requirements . . . . . . . 13 | 4. Hypervisor-to-NVE Control Plane Protocol Requirements . . . . . 15 | |||
| 5. Security Considerations . . . . . . . . . . . . . . . . . . . . 14 | 5. Security Considerations . . . . . . . . . . . . . . . . . . . . 16 | |||
| 6. IANA Considerations . . . . . . . . . . . . . . . . . . . . . . 15 | 6. IANA Considerations . . . . . . . . . . . . . . . . . . . . . . 17 | |||
| 7. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 15 | 7. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 17 | |||
| 8. References . . . . . . . . . . . . . . . . . . . . . . . . . . 15 | 8. References . . . . . . . . . . . . . . . . . . . . . . . . . . 17 | |||
| 8.1 Normative References . . . . . . . . . . . . . . . . . . . 15 | 8.1 Normative References . . . . . . . . . . . . . . . . . . . 17 | |||
| 8.2 Informative References . . . . . . . . . . . . . . . . . . 15 | 8.2 Informative References . . . . . . . . . . . . . . . . . . 17 | |||
| Appendix A. IEEE 802.1Qbg VDP Illustration (For information | Appendix A. IEEE 802.1Qbg VDP Illustration (For information | |||
| only) . . . . . . . . . . . . . . . . . . . . . . . . . . 16 | only) . . . . . . . . . . . . . . . . . . . . . . . . . . 18 | |||
| Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 19 | Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 20 | |||
| 1. Introduction | 1. Introduction | |||
| This document describes the control plane protocol requirements when | In the Split-NVE architecture shown in Figure 1, the functionality of | |||
| NVE is not co-located with the hypervisor on a server. A control | the NVE is split across an end device supporting virtualization and | |||
| plane protocol (or protocols) between a hypervisor and its associated | an external network device which is called an external NVE. The | |||
| external NVE(s) is used for the hypervisor to populate its virtual | portion of the NVE functionality located on the hypervisor/container | |||
| machines states to the NVE(s) for further handling. This protocol is | is called the tNVE and the portion located on the external NVE is | |||
| mentioned in NVO3 problem statement [I-D.ietf-nvo3-overlay-problem- | called the nNVE in this document. Overlay encapsulation/decapsulation | |||
| statement] as the third work item. When TS and NVE are on the | functions are normally off-loaded to the nNVE on the external NVE. | |||
| separate devices, we also call it split TS-NVE architecture and it is | The tNVE is normally implemented as a part of hypervisor or container | |||
| the primary interest in this document. | in a virtualized end device. | |||
| The problem statement [RFC7364] discusses the need for a control | ||||
| plane protocol (or protocols) to populate each NVE with the state | ||||
| needed to perform the required functions. In one scenario, an NVE | ||||
| provides overlay encapsulation/decapsulation packet forwarding | ||||
| services to Tenant Systems (TSs) that are co-resident within the NVE | ||||
| on the same End Device (e.g. when the NVE is embedded within a | ||||
| hypervisor or a Network Service Appliance). In such cases, there is | ||||
| no need for a standardized protocol between the hypervisor and NVE, | ||||
| as the interaction is implemented via software on a single device. | ||||
| In contrast, in the Split-NVE architecture scenarios shown in Figures 2 | ||||
| to 4, a control plane protocol(s) between a hypervisor and its | ||||
| associated external NVE(s) is required for the hypervisor to | ||||
| distribute the virtual machines' networking state to the NVE(s) for | ||||
| further handling. The protocol is in fact an NVE-internal protocol that | ||||
| runs between the tNVE and nNVE logical entities. This protocol is | ||||
| mentioned in the NVO3 problem statement [RFC7364] and appears as the | ||||
| third work item. | ||||
| Virtual machine states and state transitioning are summarized in this | Virtual machine states and state transitioning are summarized in this | |||
| document to illustrates the functionalities required by the control | document to show events where the NVE needs to take specific actions. | |||
| plane signaling protocols between hypervisor and the external NVE. | Such events might correspond to actions the control plane signaling | |||
| Then the high level requirements to be fulfiled are outlined. | protocols between the hypervisor and external NVE will need to take. | |||
| Then the high level requirements to be fulfilled are outlined. | ||||
| +-- -- -- -- Split-NVE -- -- -- --+ | ||||
| | | ||||
| | | ||||
| +---------------|-----+ | ||||
| | +------------- ----+| | | ||||
| | | +--+ +---\|/--+|| +------ --------------+ | ||||
| | | |VM|---+ ||| | \|/ | | ||||
| | | +--+ | ||| |+--------+ | | ||||
| | | +--+ | tNVE |||----- - - - - - -----|| | | | ||||
| | | |VM|---+ ||| || nNVE | | | ||||
| | | +--+ +--------+|| || | | | ||||
| | | || |+--------+ | | ||||
| | +--Hpvr/Container--+| +---------------------+ | ||||
| +---------------------+ | ||||
| End Device External NVE | ||||
| Figure 1 Split-NVE structure | ||||
| This document uses the term "hypervisor" throughout when describing | This document uses the term "hypervisor" throughout when describing | |||
| the scenario where NVE functionality is implemented on a separate | the Split-NVE scenario where part of the NVE functionality is off- | |||
| device from the "hypervisor" that contains a VM connected to a VN. | loaded to a separate device from the "hypervisor" that contains a VM | |||
| In this context, the term "hypervisor" is meant to cover any device | connected to a VN. In this context, the term "hypervisor" is meant to | |||
| type where the NVE functionality is offloaded in this fashion, e.g., | cover any device type where part of the NVE functionality is | |||
| a Network Service Appliance. | off-loaded in this fashion, e.g., a Network Service Appliance or Linux | |||
| Container. | ||||
| This document often uses the term "VM" and "Tenant System" (TS) | This document often uses the term "VM" and "Tenant System" (TS) | |||
| interchangeably, even though a VM is just one type of Tenant System | interchangeably, even though a VM is just one type of Tenant System | |||
| that may connect to a VN. For example, a service instance within a | that may connect to a VN. For example, a service instance within a | |||
| Network Service Appliance may be another type of TS. When this | Network Service Appliance may be another type of TS, or a system | |||
| document uses the term VM, it will in most cases apply to other types | running on an OS-level virtualization technology like Linux | |||
| of TSs. | Containers. When this document uses the term VM, it will in most | |||
| cases apply to other types of TSs. | ||||
| Section 2 describes VM states and state transitioning in its | ||||
| lifecycle. Section 3 introduces Hypervisor-to-NVE control plane | ||||
| protocol functionality derived from VM operations and network events. | ||||
| Section 4 outlines the requirements of the control plane protocol to | ||||
| achieve the required functionality. | ||||
| 1.1 Terminology | 1.1 Terminology | |||
| The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | |||
| "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this | "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this | |||
| document are to be interpreted as described in RFC 2119 [RFC2119]. | document are to be interpreted as described in RFC 2119 [RFC2119]. | |||
| This document uses the same terminology as found in [I-D.ietf-nvo3- | This document uses the same terminology as found in [RFC7365] and [I- | |||
| framework] and [I-D.ietf-nvo3-nve-nva-cp-req]. This section defines | D.ietf-nvo3-nve-nva-cp-req]. This section defines additional | |||
| additional terminology used by this document. | terminology used by this document. | |||
| VN Profile: Meta data associated with a VN that is used by an NVE | Split-NVE: a type of NVE whose functionality is split | |||
| when ingressing/egressing packets to/from a specific VN. Meta data | across an end device supporting virtualization and an external | |||
| could include such information as ACLs, QoS settings, etc. The VN | network device. | |||
| Profile contains parameters that apply to the VN as a whole. Control | ||||
| protocols could use the VN ID or VN Name to obtain the VN Profile. | tNVE: the portion of Split-NVE functionalities located on the end | |||
| device supporting virtualization. | ||||
| nNVE: the portion of Split-NVE functionalities located on the network | ||||
| device which directly or indirectly connects to the end device | ||||
| holding the corresponding tNVE. | ||||
| External NVE: the physical network device holding nNVE | ||||
| Hypervisor/Container: the logical collection of software, firmware | ||||
| and/or hardware that allows the creation and running of server or | ||||
| service appliance virtualization. tNVE is located on | ||||
| Hypervisor/Container. It is loosely used in this document to refer to | ||||
| the end device supporting the virtualization. For simplicity, we also | ||||
| use Hypervisor in this document to represent both hypervisor and | ||||
| container. | ||||
| VN Profile: Meta data associated with a VN that is applied to any | ||||
| attachment point to the VN. That is, VAP properties that are applied | ||||
| to all VAPs associated with a given VN and used by an NVE when | ||||
| ingressing/egressing packets to/from a specific VN. Meta data could | ||||
| include such information as ACLs, QoS settings, etc. The VN Profile | ||||
| contains parameters that apply to the VN as a whole. Control | ||||
| protocols between the NVE and NVA could use the VN ID or VN Name to | ||||
| obtain the VN Profile. | ||||
| VSI: Virtual Station Interface. [IEEE 802.1Qbg] | VSI: Virtual Station Interface. [IEEE 802.1Qbg] | |||
| VDP: VSI Discovery and Configuration Protocol [IEEE 802.1Qbg] | VDP: VSI Discovery and Configuration Protocol [IEEE 802.1Qbg] | |||
| 1.2 Target Scenarios | 1.2 Target Scenarios | |||
| In split TS-NVE architecture, an external NVE can provide an offload | In the Split-NVE architecture, an external NVE can provide an offload | |||
| of the encapsulation / decapsulation function, network policy | of the encapsulation / decapsulation function, network policy | |||
| enforcement, as well as the VN Overlay protocol overheads. This | enforcement, as well as the VN Overlay protocol overhead. This | |||
| offloading may provide performance improvements and/or resource | offloading may provide performance improvements and/or resource | |||
| savings to the End Device (e.g. hypervisor) making use of the | savings to the End Device (e.g. hypervisor) making use of the | |||
| external NVE. | external NVE. | |||
| The following figures give example scenarios where the Tenant System | The following figures give example scenarios of a Split-NVE | |||
| and NVE are on different devices in split TS-NVE architecture. | architecture. | |||
| Hypervisor Access Switch | ||||
| +------------------+ +-----+-------+ | ||||
| | +--+ +-------+ | | | | | ||||
| | |VM|---| | | VLAN | | | | ||||
| | +--+ |Virtual|---------+ NVE | +--- Underlying | ||||
| | +--+ |Switch | | Trunk | | | Network | ||||
| | |VM|---| | | | | | | ||||
| | +--+ +-------+ | | | | | ||||
| +------------------+ +-----+-------+ | ||||
| Figure 1 Hypervisor with an External NVE | ||||
| Hypervisor L2 Switch NVE | ||||
| +------------------+ +-----+ +-----+ | ||||
| | +--+ +-------+ | | | | | | ||||
| | |VM|---| | | VLAN | | VLAN | | | ||||
| | +--+ |Virtual|---------+ +-------+ +--- Underlying | ||||
| | +--+ |Switch | | Trunk | | Trunk | | Network | ||||
| | |VM|---| | | | | | | | ||||
| | +--+ +-------+ | | | | | | ||||
| +------------------+ +-----+ +-----+ | ||||
| Figure 2 Hypervisor with an External NVE | ||||
| across an Ethernet Access Switch | ||||
| Network Service Appliance Access Switch | ||||
| +--------------------------+ +-----+-------+ | ||||
| | +------------+ |\ | | | | | ||||
| | |Net Service |----| \ | | | | | ||||
| | |Instance | | \ | VLAN | | | | ||||
| | +------------+ | |---------+ NVE | +--- Underlying | ||||
| | +------------+ | | | Trunk| | | Network | ||||
| | |Net Service |----| / | | | | | ||||
| | |Instance | | / | | | | | ||||
| | +------------+ |/ | | | | | ||||
| +--------------------------+ +-----+-------+ | ||||
| Figure 3 Physical Network Service Appliance with an External NVE | ||||
| We use the term hypervisor in this document to refer to the container | ||||
| that can run the control plane protocol on the device. Thus | ||||
| Hypervisor has more generic meaning which also covers the network | ||||
| service appliance device in figure 3. | ||||
| Tenant Systems connect to NVEs via a Tenant System Interface (TSI). | ||||
| The TSI logically connects to the NVE via a Virtual Access Point | ||||
| (VAP) [I-D.ietf-nvo3-arch]. NVE may provide Layer 2 or Layer 3 | ||||
| forwarding. In split TS-NVE architecture, external NVE may be able to | ||||
| reach multiple MAC and IP addresses via a TSI. For example, Tenant | ||||
| Systems that are providing network services (such as firewall, load | ||||
| balancer, VPN gateway) are likely to have complex address hierarchy. | ||||
| It implies if a given TSI disassociates from one VN, all the MAC and | ||||
| IP addresses are also disassociated. There is no need to signal the | ||||
| deletion of every MAC or IP when the TSI is brought down or deleted. | ||||
| In the majority of cases, a VM will be acting as a simple host that | ||||
| will have a single TSI and single MAC and IP visible to the external | ||||
| NVE. | ||||
| 1.3 Motivations and Purpose | Hypervisor Access Switch | |||
| +------------------+ +-----+-------+ | ||||
| | +--+ +-------+ | | | | | ||||
| | |VM|---| | | VLAN | | | | ||||
| | +--+ | tNVE |---------+ nNVE| +--- Underlying | ||||
| | +--+ | | | Trunk | | | Network | ||||
| | |VM|---| | | | | | | ||||
| | +--+ +-------+ | | | | | ||||
| +------------------+ +-----+-------+ | ||||
| Figure 2 Hypervisor with an External NVE | ||||
| The problem statement [I-D.ietf-nvo3-overlay-problem-statement], | Hypervisor L2 Switch | |||
| discusses the needs for a control plane protocol (or protocols) to | +---------------+ +-----+ +----+---+ | |||
| populate each NVE with the state needed to perform its functions. | | +--+ +----+ | | | | | | | |||
| | |VM|---| | |VLAN | |VLAN | | | | ||||
| | +--+ |tNVE|-------+ +-----+nNVE| +--- Underlying | ||||
| | +--+ | | |Trunk| |Trunk| | | Network | ||||
| | |VM|---| | | | | | | | | ||||
| | +--+ +----+ | | | | | | | ||||
| +---------------+ +-----+ +----+---+ | ||||
| Figure 3 Hypervisor with an External NVE | ||||
| across an Ethernet Access Switch | ||||
| In one common scenario, an NVE provides overlay | Network Service Appliance Access Switch | |||
| encapsulation/decapsulation packet forwarding services to Tenant | +--------------------------+ +-----+-------+ | |||
| Systems (TSs) that are co-resident with the NVE on the same End | | +------------+ | \ | | | | | |||
| Device (e.g. when the NVE is embedded within a hypervisor or a | | |Net Service |----| \ | | | | | |||
| Network Service Appliance). In such cases, there is no need for a | | |Instance | | \ | VLAN | | | | |||
| standardized protocol between the hypervisor and NVE, as the | | +------------+ |tNVE| |------+nNVE | +--- Underlying | |||
| interaction is implemented via software on a single device. While in | | +------------+ | | | Trunk| | | Network | |||
| the split TS-NVE architecture scenarios, as shown in figure 1, some | | |Net Service |----| / | | | | | |||
| control plane signaling protocol needs to run between hypervisor and | | |Instance | | / | | | | | |||
| external NVE to pass the relevant state information. Such interaction | | +------------+ | / | | | | | |||
| is mandatory. This document will identify the requirements for such | +--------------------------+ +-----+-------+ | |||
| signaling protocol. | Figure 4 Physical Network Service Appliance with an External NVE | |||
| Section 2 describes VM states and state transitioning in its | Tenant Systems connect to external NVEs via a Tenant System Interface | |||
| lifecycle. Section 3 introduces Hypervisor-to-NVE signaling protocol | (TSI). The TSI logically connects to the external NVE via a Virtual | |||
| functionality derived from VM operations and network events. Section | Access Point (VAP) [I-D.ietf-nvo3-arch]. The external NVE may provide | |||
| 4 outlines the requirements of the control plane protocol to achieve | Layer 2 or Layer 3 forwarding. In the Split-NVE architecture, the | |||
| the required functionality. | external NVE may be able to reach multiple MAC and IP addresses via a | |||
| TSI. For example, Tenant Systems that are providing network services | ||||
| (such as transparent firewall, load balancer, VPN gateway) are likely | ||||
| to have complex address hierarchy. This implies that if a given TSI | ||||
| disassociates from one VN, all the MAC and/or IP addresses are also | ||||
| disassociated. There is no need to signal the deletion of every MAC | ||||
| or IP when the TSI is brought down or deleted. In the majority of | ||||
| cases, a VM will be acting as a simple host that will have a single | ||||
| TSI and single MAC and IP visible to the external NVE. | ||||
| 2. VM Lifecycle | 2. VM Lifecycle | |||
| [I-D.ietf-opsawg-vmm-mib] shows the state transition of a VM in its | Figure 2 of [I-D.ietf-opsawg-vmm-mib] shows the state transition of a | |||
| figure 2. Some of the VM states are of the interest to the external | VM. Some of the VM states are of interest to the external NVE. This | |||
| NVE. This section illustrates the relevant phases or event in VM | section illustrates the relevant phases and events in the VM | |||
| lifecycle. It should be noted that the following subsections do not | lifecycle. It should be noted that the following subsections do not | |||
| give an exhaustive traversal of VM lifecycle state. They are intended | give an exhaustive traversal of VM lifecycle state. They are intended | |||
| as the illustrative examples which are relevant to split TS-NVE | as the illustrative examples which are relevant to Split-NVE | |||
| architecture, not as prescriptive text; the goal is to capture | architecture, not as prescriptive text; the goal is to capture | |||
| sufficient detail to set a context for the signaling protocol | sufficient detail to set a context for the signaling protocol | |||
| functionality and requirements described in the following sections. | functionality and requirements described in the following sections. | |||
| 2.1 VM Creation | 2.1 VM Creation Event | |||
| VM creation runs through the states in the order of preparing, | A VM creation event causes the VM state to transition from Preparing to | |||
| shutdown and running [I-D.ietf-opsawg-vmm-mib]. The end device | Shutdown and then to Running [I-D.ietf-opsawg-vmm-mib]. The end | |||
| allocates and initializes local virtual resources like storage in the | device allocates and initializes local virtual resources like storage | |||
| VM preparing state. In shutdown state, VM has everything ready except | in the VM Preparing state. In Shutdown state, the VM has everything | |||
| that CPU execution is not scheduled by the hypervisor and VM's memory | ready except that CPU execution is not scheduled by the hypervisor | |||
| is not resident in the hypervisor. From the shutdown state to running | and VM's memory is not resident in the hypervisor. Moving from the | |||
| state, normally it requires the human execution or system triggered | Shutdown state to the Running state normally requires human action or a | |||
| event. Running state indicates the VM is in the normal execution | system-triggered event. Running state indicates the VM is in the | |||
| state. Frame can be sent and received correctly. No ongoing | normal execution state. As part of transitioning the VM to the | |||
| migration, suspension or shutdown is in process. | Running state, the hypervisor must also provision network | |||
| connectivity for the VM's TSI(s) so that Ethernet frames can be sent | ||||
| and received correctly. No ongoing migration, suspension or shutdown | ||||
| is in process. | ||||
| In VM creation phase, tenant system has to be associated with the | In the VM creation phase, the VM's TSI has to be associated with the | |||
| external NVE. Association here indicates that hypervisor and the | external NVE. Association here indicates that hypervisor and the | |||
| external NVE have signaled each other and reached some agreement. | external NVE have signaled each other and reached some agreement. | |||
| Relevant parameters or information have been provisioned properly. | Relevant networking parameters or information have been provisioned | |||
| External NVE should be informed with VM's MAC address and/or IP | properly. The External NVE should be informed of the VM's TSI MAC | |||
| address. Another example is that hypervisor may use a locally | address and/or IP address. In addition to external network | |||
| significant VLAN ID to indicate the traffic destined to a specified | connectivity, the hypervisor may provide local network connectivity | |||
| VN. Both hypervisor and NVE sides should agree on that VID value for | between the VM's TSI and other VMs' TSIs that are co-resident on the | |||
| later traffic identification and forwarding. | same hypervisor. When the intra or inter-hypervisor connectivity is | |||
| extended to the external NVE, a locally significant tag, e.g. VLAN | ||||
| ID, should be used between the hypervisor and the external NVE to | ||||
| differentiate each VN's traffic. Both the hypervisor and external NVE | ||||
| sides must agree on that tag value for traffic identification, | ||||
| isolation and forwarding. | ||||
| External NVE needs to do some preparation work before it signals | The external NVE may need to do some preparation work before it | |||
| successful association with tenant system. Such preparation work may | signals successful association with the TSI. Such preparation work may | |||
| include locally saving the states and binding information of the | include locally saving the states and binding information of the | |||
| tenant system and its VN, communicating with peer NVEs and/or NVA for | tenant system interface and its VN, communicating with the NVA for | |||
| network provisioning, etc. | network provisioning, etc. | |||
| Tenant System association should be performed before VM enters | Tenant System interface association should be performed before the VM | |||
| running state, preferably in shutdown state. If association with | enters running state, preferably in Shutdown state. If association | |||
| external NVE fails, VM should not go into running state. | with external NVE fails, the VM should not go into running state. | |||
| 2.2 VM Live Migration | 2.2 VM Live Migration Event | |||
| Live migration is sometimes referred to as "hot" migration, in that | Live migration is sometimes referred to as "hot" migration, in that | |||
| from an external viewpoint, the VM appears to continue to run while | from an external viewpoint, the VM appears to continue to run while | |||
| being migrated to another server (e.g., TCP connections generally | being migrated to another server (e.g., TCP connections generally | |||
| survive this class of migration). In contrast, suspend/resume (or | survive this class of migration). In contrast, "cold" migration | |||
| "cold") migration consists of suspending VM execution on one server | consists of shutting down VM execution on one server and restarting it on | |||
| and resuming it on another. For simplicity, the following abstract | another. For simplicity, the following abstract summary about live | |||
| summary about live migration assumes shared storage, so that the VM's | migration assumes shared storage, so that the VM's storage is | |||
| storage is accessible to the source and destination servers. Assume | accessible to the source and destination servers. Assume a VM live | |||
| VM migrates from hypervisor 1 to hypervisor 2. VM live migration | migrates from hypervisor 1 to hypervisor 2. Such a migration event | |||
| involves the state transition on both hypervisors, source hypervisor | involves the state transition on both hypervisors, source hypervisor | |||
| 1 and destination hypervisor 2. VM state on source hypervisor 1 | 1 and destination hypervisor 2. VM state on source hypervisor 1 | |||
| transits from running to migrating and then to shutdown [I-D.ietf- | transits from Running to Migrating and then to Shutdown [I-D.ietf- | |||
| opsawg-vmm-mib]. VM state on destination hypervisor 2 transits from | opsawg-vmm-mib]. VM state on destination hypervisor 2 transits from | |||
| shutdown to migrating and then running. | Shutdown to Migrating and then Running. | |||
| External NVE connecting to destination hypervisor 2 has to associate | The external NVE connected to destination hypervisor 2 has to | |||
| the migrating VM with it by saving VM's MAC and/or IP addresses, its | associate the migrating VM's TSI with it by discovering the TSI's MAC | |||
| VN, locally significant VID if any, and provisioning other network | and/or IP addresses, its VN, locally significant VID if any, and | |||
| related parameters of VM. The NVE may be informed about the VM's peer | provisioning other network related parameters of the TSI. The | |||
| VMs, storage devices and other network appliances with which the VM | external NVE may be informed about the VM's peer VMs, storage devices | |||
| needs to communicate or is communicating. VM on destination | and other network appliances with which the VM needs to communicate | |||
| hypervisor 2 SHOULD not go to running state before all the network | or is communicating. The migrated VM on destination hypervisor 2 | |||
| provisioning and binding has been done. | SHOULD NOT go to Running state before all the network provisioning | |||
| and binding has been done. | ||||
| VM on source hypervisor and destination hypervisor SHOULD not be in | The migrating VM SHOULD NOT be in Running state at the same time on | |||
| running state at the same time during migration. VM on source | the source hypervisor and destination hypervisor during migration. | |||
| hypervisor goes into shutdown state only when VM on destination | The VM on the source hypervisor does not transition into Shutdown | |||
| hypervisor has successfully been entering the running state. It is | state until the VM successfully enters the Running state on the | |||
| possible that VM on the source hypervisor stays in migrating state | destination hypervisor. It is possible that VM on the source | |||
| for a while after VM on the destination hypervisor is in running | hypervisor stays in Migrating state for a while after VM on the | |||
| state. | destination hypervisor is in Running state. | |||
| 2.3 VM termination | 2.3 VM Termination Event | |||
| VM termination is also referred to as "powering off" a VM. VM | VM termination event is also referred to as "powering off" a VM. VM | |||
| termination leads its state going to shutdown. There are two possible | termination event leads to its state going to Shutdown. There are two | |||
| causes to terminate a VM [I-D.ietf-opsawg-vmm-mib], one is the normal | possible causes to terminate a VM [I-D.ietf-opsawg-vmm-mib], one is | |||
| "power off" of a running VM; the other is that VM has been migrated | the normal "power off" of a running VM; the other is that VM has been | |||
| to other place and the VM image on the source hypervisor has to stop | migrated to another hypervisor and the VM image on the source | |||
| executing and to be shut down. | hypervisor has to stop executing and to be shut down. | |||
| In VM termination, the external NVE connecting to that VM needs to | In VM termination, the external NVE connecting to that VM needs to | |||
| deprovision the VM, i.e. delete the network parameters associated | deprovision the VM, i.e. delete the network parameters associated | |||
| with that VM. In other words, external NVE has to de-associate the | with that VM. In other words, the external NVE has to de-associate | |||
| VM. | the VM's TSI. | |||
| 2.4 VM Pause, suspension and resumption | 2.4 VM Pause, Suspension and Resumption Events | |||
| VM pause event leads VM transiting from running state to paused | The VM pause event leads to the VM transiting from Running state to | |||
| state. Paused state indicates VM is resident in memory but no longer | Paused state. The Paused state indicates that the VM is resident in | |||
| scheduled to execute by the hypervisor [I-D.ietf-opsawg-vmm-mib]. VM | memory but no longer scheduled to execute by the hypervisor [I- | |||
| can be easily re-activated from paused state to running state. | D.ietf-opsawg-vmm-mib]. The VM can be easily re-activated from Paused | |||
| state to Running state. | ||||
| VM suspension leads VM to transit state from running to suspended and | The VM suspension event leads to the VM transiting from Running state | |||
| VM resumption leads VM to transit state from suspended to running. | to Suspended state. The VM resumption event leads to the VM | |||
| Suspended state means the memory and CPU execution state of the | transiting state from Suspended state to Running state. Suspended | |||
| virtual machine are saved to persistent store. During this state, | state means the memory and CPU execution state of the virtual machine | |||
| the virtual machine is not scheduled to execute by the hypervisor [I- | are saved to persistent store. During this state, the virtual | |||
| D.ietf-opsawg-vmm-mib]. | machine is not scheduled to execute by the hypervisor [I-D.ietf- | |||
| opsawg-vmm-mib]. | ||||
| In split TS-NVE architecture, external NVE should keep any paused or | In the Split-NVE architecture, the external NVE should keep any | |||
| suspended VM in association as VM can return to running state at any | paused or suspended VM in association as the VM can return to Running | |||
| time. | state at any time. | |||
| 3. Hypervisor-to-NVE Signaling protocol functionality | 3. Hypervisor-to-NVE Control Plane Protocol Functionality | |||
| The following subsections show the illustrative examples of the state | The following subsections show the illustrative examples of the state | |||
| transitions on external NVE which are relevant to Hypervisor-to-NVE | transitions on external NVE which are relevant to Hypervisor-to-NVE | |||
| Signaling protocol functionality. It should be noted they are not | Signaling protocol functionality. It should be noted they are not | |||
| prescriptive text for full state machines. | prescriptive text for full state machines. | |||
| 3.1 VN connect and disconnect | 3.1 VN Connect and Disconnect | |||
| When an NVE is external, a protocol is needed between the End Device | In Split-NVE scenario, a protocol is needed between the End | |||
| (e.g. Hypervisor) making use of the external NVE and the external NVE | Device (e.g. Hypervisor) making use of the external NVE and the | |||
| in order to make the NVE aware of the changing VN membership | external NVE in order to make the external NVE aware of the changing | |||
| requirements of the Tenant Systems within the End Device. | VN membership requirements of the Tenant Systems within the End | |||
| Device. | ||||
| A key driver for using a protocol rather than using static | A key driver for using a protocol rather than using static | |||
| configuration of the external NVE is because the VN connectivity | configuration of the external NVE is because the VN connectivity | |||
| requirements can change frequently as VMs are brought up, moved and | requirements can change frequently as VMs are brought up, moved and | |||
| brought down on various hypervisors throughout the data center. | brought down on various hypervisors throughout the data center or | |||
| external cloud. | ||||
| +---------------+ Recv VN_connect; +-------------------+ | +---------------+ Recv VN_connect; +-------------------+ | |||
| |VN_Disconnected| return Local_Tag value |VN_Connected | | |VN_Disconnected| return Local_Tag value |VN_Connected | | |||
| +---------------+ for VN if successful; +-------------------+ | +---------------+ for VN if successful; +-------------------+ | |||
| |VN_ID; |-------------------------->|VN_ID; | | |VN_ID; |-------------------------->|VN_ID; | | |||
| |VN_State= | |VN_State=connected;| | |VN_State= | |VN_State=connected;| | |||
| |disconnected; | |Num_TSI_Associated;| | |disconnected; | |Num_TSI_Associated;| | |||
| | |<----Recv VN_disconnect----|Local_Tag; | | | |<----Recv VN_disconnect----|Local_Tag; | | |||
| +---------------+ |VN_Context; | | +---------------+ |VN_Context; | | |||
| +-------------------+ | +-------------------+ | |||
| Figure 4 State Transition Summary of a VAP Instance | Figure 5 State Transition Example of a VAP Instance | |||
| on an External NVE | on an External NVE | |||
| Figure 4 show the state transition for a VAP on the external NVE. An | Figure 5 shows the state transition for a VAP on the external NVE. An | |||
| NVE that supports the hypervisor to NVE signaling protocol should | NVE that supports the hypervisor to NVE control plane protocol should | |||
| support one instance of the state machine for each active VN. The | support one instance of the state machine for each active VN. The | |||
| state transition on the external NVE is normally triggered by the | state transition on the external NVE is normally triggered by the | |||
| hypervisor-facing side events and behaviors. Some of the interleaved | hypervisor-facing side events and behaviors. Some of the interleaved | |||
| interaction between NVE and NVA will be illustrated for better | interaction between NVE and NVA will be illustrated for better | |||
| understanding of the whole procedures; while some of them may not be | understanding of the whole procedure; while others of them may not be | |||
| shown. More detailed information regarding that is available in [I- | shown. More detailed information regarding that is available in [I- | |||
| D.ietf-nvo3-nve-nva-cp-req]. | D.ietf-nvo3-nve-nva-cp-req]. | |||
| The NVE must be notified when an End Device requires connection to a | The external NVE must be notified when an End Device requires | |||
| particular VN and when it no longer requires connection. In addition, | connection to a particular VN and when it no longer requires | |||
| the external NVE must provide a local tag value for each connected VN | connection. In addition, the external NVE must provide a local tag | |||
| to the End Device to use for exchange of packets between the End | value for each connected VN to the End Device to use for exchange of | |||
| Device and the NVE (e.g. a locally significant 802.1Q tag value). How | packets between the End Device and the external NVE (e.g. a locally | |||
| "local" the significance is depends on whether the Hypervisor has a | significant 802.1Q tag value). How "local" the significance is | |||
| direct physical connection to the NVE (in which case the significance | depends on whether the Hypervisor has a direct physical connection to | |||
| is local to the physical link), or whether there is an Ethernet | the external NVE (in which case the significance is local to the | |||
| switch (e.g. a blade switch) connecting the Hypervisor to the NVE (in | physical link), or whether there is an Ethernet switch (e.g. a blade | |||
| which case the significance is local to the intervening switch and | switch) connecting the Hypervisor to the NVE (in which case the | |||
| all the links connected to it). | significance is local to the intervening switch and all the links | |||
| connected to it). | ||||
| These VLAN tags are used to differentiate between different VNs as | These VLAN tags are used to differentiate between different VNs as | |||
| packets cross the shared access network to the external NVE. When the | packets cross the shared access network to the external NVE. When the | |||
| NVE receives packets, it uses the VLAN tag to identify the VN of | external NVE receives packets, it uses the VLAN tag to identify the | |||
| packets coming from a given TSI, strips the tag, and adds the | VN of packets coming from a given TSI, strips the tag, and adds the | |||
| appropriate overlay encapsulation for that VN and send to the | appropriate overlay encapsulation for that VN and sends it towards | |||
| corresponding VAP. | the corresponding remote NVE across the underlying IP network. | |||
| The Identification of the VN in this protocol could either be through | The Identification of the VN in this protocol could either be through | |||
| a VN Name or a VN ID. A globally unique VN Name facilitates | a VN Name or a VN ID. A globally unique VN Name facilitates | |||
| portability of a Tenant's Virtual Data Center. Once an NVE receives a | portability of a Tenant's Virtual Data Center. Once an external NVE | |||
| VN connect indication, the NVE needs a way to get a VN Context | receives a VN connect indication, the NVE needs a way to get a VN | |||
| allocated (or receive the already allocated VN Context) for a given | Context allocated (or receive the already allocated VN Context) for a | |||
| VN Name or ID (as well as any other information needed to transmit | given VN Name or ID (as well as any other information needed to | |||
| encapsulated packets). How this is done is the subject of the NVE- | transmit encapsulated packets). How this is done is the subject of | |||
| to-NVA (called NVE-to-NVA in this document) protocol which is part | the NVE-to-NVA protocol which is part of work items 1 and 2 in | |||
| of work items 1 and 2 in [I-D.ietf-nvo3-overlay-problem-statement]. | [RFC7364]. | |||
| VN_connect message can be explicit or implicit. Explicit means the | VN_connect message can be explicit or implicit. Explicit means the | |||
| hypervisor sending a message explicitly to request for the connection | hypervisor sending a message explicitly to request for the connection | |||
| to a VN. Implicit means the external NVE receives other messages, | to a VN. Implicit means the external NVE receives other messages, | |||
| e.g. very first TSI associate message for a given VN as in next | e.g. very first TSI associate message (see the next subsection) for a | |||
| subsection, to implicitly indicate its interest to connect to a VN. | given VN, to implicitly indicate its interest to connect to a VN. | |||
| A VN_disconnect message will make NVE release all the resources for | A VN_disconnect message will indicate that the NVE can release all | |||
| that disconnected VN and transit to VN_disconnected state. The local | the resources for that disconnected VN and transit to VN_disconnected | |||
| tag assigned for that VN can possibly be reclaimed by other VN. | state. The local tag assigned for that VN can possibly be reclaimed | |||
| by other VN. | ||||
| 3.2 TSI associate and activate | 3.2 TSI Associate and Activate | |||
| Typically, a TSI is assigned a single MAC address and all frames | Typically, a TSI is assigned a single MAC address and all frames | |||
| transmitted and received on that TSI use that single MAC address. As | transmitted and received on that TSI use that single MAC address. As | |||
| mentioned earlier, it is also possible for a Tenant System to | mentioned earlier, it is also possible for a Tenant System to | |||
| exchange frames using multiple MAC addresses or packets with multiple | exchange frames using multiple MAC addresses or packets with multiple | |||
| IP addresses. | IP addresses. | |||
| Particularly in the case of a TS that is forwarding frames or packets | Particularly in the case of a TS that is forwarding frames or packets | |||
| from other TSs, the NVE will need to communicate the mapping between | from other TSs, the external NVE will need to communicate the mapping | |||
| the NVE's IP address (on the underlying network) and ALL the | between the NVE's IP address (on the underlying network) and ALL the | |||
| addresses the TS is forwarding on behalf of to NVA in each | addresses the TS is forwarding on behalf of for the corresponding VN | |||
| corresponding VN. | to the NVA. | |||
| The NVE has two ways in which it can discover the tenant addresses | The NVE has two ways in which it can discover the tenant addresses | |||
| for which frames must be forwarded to a given End Device (and | for which frames must be forwarded to a given End Device (and | |||
| ultimately to the TS within that End Device). | ultimately to the TS within that End Device). | |||
| 1. It can glean the addresses by inspecting the source addresses in | 1. It can glean the addresses by inspecting the source addresses in | |||
| packets it receives from the End Device. | packets it receives from the End Device. | |||
| 2. The hypervisor can explicitly signal the address associations of | 2. The hypervisor can explicitly signal the address associations of | |||
| a TSI to the external NVE. The address association includes all the | a TSI to the external NVE. The address association includes all the | |||
| MAC and/or IP addresses possibly used as source addresses in a packet | MAC and/or IP addresses possibly used as source addresses in a packet | |||
| sent from the hypervisor to external NVE. External NVE may further | sent from the hypervisor to external NVE. The external NVE may | |||
| use this information to filter the future traffic from the | further use this information to filter the future traffic from the | |||
| hypervisor. | hypervisor. | |||
| To perform the second approach above, the "hypervisor-to-NVE" | To perform the second approach above, the "hypervisor-to-NVE" | |||
| protocol requires a means to allow End Devices to communicate new | protocol requires a means to allow End Devices to communicate new | |||
| tenant addresses associations for a given TSI within a given VN. | tenant addresses associations for a given TSI within a given VN. | |||
| Figure 5 shows the state machine for a TSI connecting to a VAP on the | Figure 6 shows the example of a state transition for a TSI connecting | |||
| external NVE. An NVE that supports the hypervisor to NVE signaling | to a VAP on the external NVE. An NVE that supports the hypervisor to | |||
| protocol should support one instance of the state machine for each | NVE control plane protocol may support one instance of the state | |||
| TSI connecting to a given VN. | machine for each TSI connecting to a given VN. | |||
| disassociate; +--------+ | disassociate; +--------+ disassociate | |||
| +--------------->| Init |<--------clear-------+ | +--------------->| Init |<--------------------+ | |||
| |or keepalive +--------+ | | | +--------+ | | |||
| |timer timeout; | | | | | | | | | |||
| | | | | | | | | | | |||
| | +--------+ | | | +--------+ | | |||
| | | | | | | | | | | |||
| | associate | | activate | | | associate | | activate | | |||
| | +-----------+ +-----------+ | | | +-----------+ +-----------+ | | |||
| | | | | | | | | | | |||
| | | | | | | | | | | |||
| | \|/ \|/ | | | \|/ \|/ | | |||
| +--------------------+ +---------------------+ | +--------------------+ +---------------------+ | |||
| | Associated | | Activated | | | Associated | | Activated | | |||
| +--------------------+ +---------------------+ | +--------------------+ +---------------------+ | |||
| |TSI_ID; | |TSI_ID; | | |TSI_ID; | |TSI_ID; | | |||
| |Port; |-----activate---->|Port; | | |Port; |-----activate---->|Port; | | |||
| |VN_ID; | |VN_ID; | | |VN_ID; | |VN_ID; | | |||
| |State=associated; | |State=activated ; |-+ | |State=associated; | |State=activated ; |-+ | |||
| +-|Num_Of_Addr; |<---deactivate;---|Num_Of_Addr; | | | +-|Num_Of_Addr; |<---deactivate;---|Num_Of_Addr; | | | |||
| | |List_Of_Addr; | or keepactive List_Of_Addr; | | | | |List_Of_Addr; | |List_Of_Addr; | | | |||
| | |ResetKeepaliveTimer;| timer timeout; |ResetKeepactiveTimer;| | | | +--------------------+ +---------------------+ | | |||
| | +--------------------+ +---------------------+ | | | /|\ /|\ | | |||
| | /|\ /|\ | | | | | | | |||
| | | | | | +---------------------+ +-------------------+ | |||
| +---------------------+ +-------------------+ | add/remove/updt addr; add/remove/updt addr; | |||
| add/remove/updt addr; add/remove/updt addr; | or update port; or update port; | |||
| or update port; or or update port; or | ||||
| Recv keepalive pkt Recv keepactive pkt | ||||
| from TSI; or data msg from TSI; | ||||
| Figure 5 State Transition Summary of a TSI Instance | Figure 6 State Transition Example of a TSI Instance | |||
| on an External NVE | on an External NVE | |||
| Associated state of a TSI instance on an external NVE indicates all | Associated state of a TSI instance on an external NVE indicates all | |||
| the addresses for that TSI have already been associated with the VAP of | the addresses for that TSI have already been associated with the VAP of | |||
| the external NVE on port p for a given VN but no real traffic to and | the external NVE on port p for a given VN but no real traffic to and | |||
| from the TSI is expected and allowed to pass through. NVE has | from the TSI is expected and allowed to pass through. An NVE has | |||
| reserved all the necessary resources for that TSI. NVE may report the | reserved all the necessary resources for that TSI. An external NVE | |||
| mappings of NVE's underlay IP address and the associated TSI | may report the mappings of its underlay IP address and the | |||
| addresses to NVA and relevant network nodes may save such information | associated TSI addresses to NVA and relevant network nodes may save | |||
| to its mapping table but not forwarding table. NVE may create ACL or | such information to its mapping table but not forwarding table. An NVE | |||
| filter rules based on the associated TSI addresses on the attached | may create ACL or filter rules based on the associated TSI addresses | |||
| port p but not enable them yet. Local tag for the VN corresponding to | on the attached port p but not enable them yet. Local tag for the VN | |||
| the TSI instance should be provisioned on port p to receive packets. | corresponding to the TSI instance should be provisioned on port p to | |||
| receive packets. | ||||
| VM migration discussed in section 2 may cause the hypervisor to send | VM migration event (discussed in section 2) may cause the hypervisor to | |||
| associate message to the NVE connecting the destination hypervisor | send an associate message to the NVE connected to the destination | |||
| the VM migrates to. It is similar as the resource reservation request | hypervisor the VM migrates to. VM creation event may also lead to the | |||
| to make sure the VM can be successfully migrated later. If such | same practice. | |||
| association fails, VM may choose another destination hypervisor to | ||||
| migrate to or alert with an administrative message. VM creation event | ||||
| may also lead to the same practice. | ||||
| Activated state of a TSI instance on an external NVE indicates that | The Activated state of a TSI instance on an external NVE indicates | |||
| all the addresses for that TSI are functioning correctly on port p and | that all the addresses for that TSI are functioning correctly on port p | |||
| traffic can be received from and sent to that TSI on NVE. The | and traffic can be received from and sent to that TSI via the NVE. | |||
| mappings of NVE's underlay IP address and the associated TSI | The mappings of the NVE's underlay IP address and the associated TSI | |||
| addresses should be put into the forwarding table rather than the | addresses should be put into the forwarding table rather than the | |||
| mapping table on relevant network nodes. ACL or filter rules based on | mapping table on relevant network nodes. ACL or filter rules based on | |||
| the associated TSI addresses on the attached port p in NVE are | the associated TSI addresses on the attached port p in NVE are | |||
| enabled. Local tag for the VN corresponding to the TSI instance MUST | enabled. Local tag for the VN corresponding to the TSI instance MUST | |||
| be provisioned on port p to receive packets. | be provisioned on port p to receive packets. | |||
| Activate message makes the state transit from Init or Associated to | The Activate message makes the state transit from Init or Associated | |||
| Activated. VM creation, VM migration and VM resumption events | to Activated. VM creation, VM migration and VM resumption events | |||
| discussed in section 2 may trigger activate message to be sent from | discussed in section 2 may trigger the Activate message to be sent | |||
| the hypervisor to the external NVE. | from the hypervisor to the external NVE. | |||
| As mentioned in last subsection, associate or activate message from | ||||
| the very first TSI connecting to a VN on an NVE is also considered as | ||||
| the implicit VN_connect signal to create a VAP for that VN. | ||||
| TSI information may get updated either in Associated or Activated | TSI information may get updated either in Associated or Activated | |||
| state. Add or remove the associated addresses, update current | state. The following are considered updates to the TSI information: | |||
| associated addresses for example updating IP for a given MAC, update | add or remove the associated addresses, update current associated | |||
| NVE port information from which the message receives are all | addresses (for example updating IP for a given MAC), update NVE port | |||
| considered as TSI information updating. Such update does not change | information based on where the NVE receives messages. Such updates do | |||
| the state of TSI. When any address associated to a given TSI changes, | not change the state of TSI. When any address associated to a given | |||
| NVE should inform the NVA to update the mapping information on NVE's | TSI changes, the NVE should inform the NVA to update the mapping | |||
| underlying address and the associated TSI addresses. NVE should also | information on NVE's underlying address and the associated TSI | |||
| change its local ACL or filter settings accordingly for the relevant | addresses. The NVE should also change its local ACL or filter | |||
| addresses. Port information update will cause the local tag for the | settings accordingly for the relevant addresses. Port information | |||
| VN corresponding to the TSI instance provisioned on new port p and | update will cause the local tag for the VN corresponding to the TSI | |||
| removed from old port. | instance to be provisioned on new port p and removed from the old | |||
| port. | ||||
| NVE keeps a timer for each TSI instance associated or activated on | ||||
| it. When NVE receives the keepalive or keepactive message for a TSI | ||||
| instance, it should reset the timer. Keepactive timer may also be | ||||
| reset by receiving the data packet from any associated address of the | ||||
| corresponding TSI instance. Keepactive timer times out leads the | ||||
| state transiting from Activated to Associated. Keepalive timer times | ||||
| out leads the state transiting from Associated to Init. | ||||
| 3.3 TSI disassociate, deactivate and clear | 3.3 TSI Disassociate and Deactivate | |||
| Disassociate and deactivate conceptually are the reverse behaviors of | Disassociate and deactivate conceptually are the reverse behaviors of | |||
| associate and activate. From Activated state to Associated state, NVE | associate and activate. From Activated state to Associated state, the | |||
| needs to make sure the resources still reserved but the addresses | external NVE needs to make sure the resources are still reserved but | |||
| associated to the TSI not functioning and no traffic to and from the | the addresses associated to the TSI are not functioning and no | |||
| TSI is expected and allowed to pass through. For example, NVE needs | traffic to and from the TSI is expected and allowed to pass through. | |||
| to inform NVA to remove the relevant addresses mapping information | For example, the NVE needs to inform the NVA to remove the relevant | |||
| from forwarding or routing table. ACL or filtering rules regarding | addresses mapping information from forwarding or routing table. ACL | |||
| the relevant addresses should be disabled. From Associated or | or filtering rules regarding the relevant addresses should be | |||
| Activated state to Init state, NVE will release all the resource | disabled. From Associated or Activated state to the Init state, the | |||
| relevant to TSI instances. NVE should also inform the NVA to remove | NVE will release all the resources relevant to TSI instances. The NVE | |||
| the relevant entries from mapping table. ACL or filtering rules | should also inform the NVA to remove the relevant entries from | |||
| regarding the relevant addresses should be removed. Local tag | mapping table. ACL or filtering rules regarding the relevant | |||
| provisioning on the connecting port on NVE should be cleared. | addresses should be removed. Local tag provisioning on the connecting | |||
| port on NVE should be cleared. | ||||
| VM suspension discussed in section 2 may cause the relevant TSI | A VM suspension event (discussed in section 2) may cause the relevant | |||
| instance(s) on NVE transit from Activated to Associated state. VM | TSI instance(s) on the NVE to transit from Activated to Associated | |||
| pause normally does not affect the state of the relevant TSI | state. A VM pause event normally does not affect the state of the | |||
| instance(s) on NVE as the VM is expected to run again soon. VM | relevant TSI instance(s) on the NVE as the VM is expected to run | |||
| shutdown will cause the relevant TSI instance(s) on NVE transit to | again soon. The VM shutdown event will normally cause the relevant | |||
| Init state from Activated state. All resources should be released. | TSI instance(s) on NVE transit to Init state from Activated state. | |||
| All resources should be released. | ||||
| VM migration will lead the TSI instance on the source NVE to leave | A VM migration will lead the TSI instance on the source NVE to leave | |||
| Activated state. Such state transition on source NVE should not occur | Activated state. When a VM migrates to another hypervisor connecting | |||
| earlier than the TSI instance on the destination NVE transits to | to the same NVE, i.e. source and destination NVE are the same, NVE | |||
| Activated state. Otherwise traffic interruption may occur. When a VM | should use TSI_ID and incoming port to differentiate two TSI | |||
| migrates to another hypervisor connecting to the same NVE, i.e. | instances. | |||
| source and destination NVE are the same, NVE should use TSI_ID and | ||||
| incoming port to differentiate two TSI instances. | ||||
| Although the triggering messages for state transition shown in Figure | Although the triggering messages for state transition shown in Figure | |||
| 5 do not indicate the difference between VM creation/shutdown and | 6 do not indicate the difference between VM creation/shutdown event | |||
| VM migration arrival/departure, the NVE can make optimizations if it | and VM migration arrival/departure event, the external NVE can make | |||
| is notified of such information. For example, if NVE knows the | optimizations if it is notified of such information. For example, if | |||
| incoming activate message caused by migration rather than VM | the NVE knows the incoming activate message is caused by migration | |||
| creation, some mechanisms may be employed or triggered to make sure | rather than VM creation, some mechanisms may be employed or triggered | |||
| the dynamic configurations or provisionings on the destination NVE | to make sure the dynamic configurations or provisionings on the | |||
| same as those on the source NVE for the migrated VM, for example | destination NVE are the same as those on the source NVE for the | |||
| multicast group memberships. | migrated VM. For example, an IGMP query [RFC2236] can be triggered by the | |||
| destination external NVE to the migrated VM on destination hypervisor | ||||
| so that the VM is forced to answer with an IGMP report to the multicast | ||||
| router. Then the multicast router can correctly send the multicast | ||||
| traffic to the new external NVE for those multicast groups the VM had | ||||
| joined before the migration. | ||||
| 4. Hypervisor-to-NVE Signaling Protocol requirements | 4. Hypervisor-to-NVE Control Plane Protocol Requirements | |||
| Req-1: The protocol is able to run between the hypervisor and its | Req-1: The protocol MUST support a bridged network connecting End | |||
| associated external NVE which may directly connected or bridged in | Devices to External NVE. | |||
| split-NVE architecture. | ||||
| Req-2: The protocol MUST support the hypervisor initiating a request | Req-2: The protocol MUST support multiple End Devices sharing the | |||
| to its associated external NVE to be connected/disconnected to a | same External NVE via the same physical port across a bridged | |||
| given VN. | network. | |||
| Req-3: In response to the connection request to a given VN received | Req-3: The protocol MAY support an End Device using multiple external | |||
| on NVE's port p as per Req-1, the protocol SHOULD support NVE | NVEs simultaneously, but only one external NVE for each VN. | |||
| replying a locally significant tag assigned, for example 802.1Q tag | ||||
| value, to each of the VN it is member of. NVE should keep the record | ||||
| of VN ID, local tag assigned and port p triplet. | ||||
| Req-4: The protocol MUST support the hypervisor initiating a request | Req-4: The protocol MAY support an End Device using multiple external | |||
| to associate/disassociate, activate/deactivate or clear address(es) of | NVEs simultaneously for the same VN. | |||
| a TSI instance to a VN on an NVE port. All requests should be | ||||
| logically consistent with text in section 5.2 & 5.3. | ||||
| Req-5: The protocol MUST support the hypervisor initiating a request | Req-5: The protocol MUST allow the End Device initiating a request to | |||
| to add, remove or update address(es) associated with a TSI instance | its associated External NVE to be connected/disconnected to a given | |||
| on the external NVE. Addresses can be expressed in different formats, | VN. | |||
| Req-6: The protocol MUST allow an External NVE initiating a request | ||||
| to its connected End Devices to be disconnected from a given VN. | ||||
| Req-7: When a TS attaches to a VN, the protocol MUST allow for an End | ||||
| Device and its external NVE to negotiate a locally-significant tag | ||||
| for carrying traffic associated with a specific VN (e.g., 802.1Q | ||||
| tags). | ||||
| Req-8: The protocol MUST allow an End Device initiating a request to | ||||
| associate/disassociate and/or activate/deactivate address(es) of a TSI | ||||
| instance to a VN on an NVE port. | ||||
| Req-9: The protocol MUST allow the External NVE initiating a request | ||||
| to disassociate and/or deactivate address(es) of a TSI instance to a | ||||
| VN on an NVE port. | ||||
| Req-10: The protocol MUST allow an End Device initiating a request to | ||||
| add, remove or update address(es) associated with a TSI instance on | ||||
| the external NVE. Addresses can be expressed in different formats, | ||||
| for example, MAC, IP or pair of IP and MAC. | for example, MAC, IP or pair of IP and MAC. | |||
| Req-6: When any request of the protocol fails, a reason code MUST be | Req-11: The protocol MUST allow the External NVE to authenticate the | |||
| provided in the reply. | End Device connected. | |||
| Req-7: The protocol MAY support the hypervisor explicitly informing | Req-12: The protocol MUST be able to run over L2 links between the | |||
| NVE when a migration starts. It may help NVE to differentiate a new | End Device and its External NVE. | |||
| associated/activated TSI resulting from VM creation or VM migration. | ||||
| Req-8: The protocol SHOULD be extensible to carry more parameters to | Req-13: The protocol SHOULD support the End Device indicating if an | |||
| meet future requirements, for example, QoS settings. | associate or activate request from it results from a VM hot migration | |||
| event. | ||||
| There are multiple candidate protocols probably with some simple | VDP [IEEE 802.1Qbg] is a candidate protocol running on layer 2. | |||
| extensions that can be used as control plane protocol between | Appendix A illustrates VDP for reader's information. It requires | |||
| hypervisor and the external NVE. They include VDP [IEEE 802.1Qbg], | extensions to fulfill the requirements in this document. | |||
| LLDP, XMPP, and HTTP REST. Multiple factors influence the choice of | ||||
| protocol(s), for example, connection between hypervisor and external | ||||
| NVE is L2 or L3. Appendix A illustrates VDP for reader's information. | ||||
| 5. Security Considerations | 5. Security Considerations | |||
| NVEs must ensure that only properly authorized Tenant Systems are | NVEs must ensure that only properly authorized Tenant Systems are | |||
| allowed to join and become a part of any specific Virtual Network. In | allowed to join and become a part of any specific Virtual Network. In | |||
| addition, NVEs will need appropriate mechanisms to ensure that any | addition, NVEs will need appropriate mechanisms to ensure that any | |||
| hypervisor wishing to use the services of an NVE is properly | hypervisor wishing to use the services of an NVE is properly | |||
| authorized to do so. One design point is whether the hypervisor | authorized to do so. One design point is whether the hypervisor | |||
| should supply the NVE with necessary information (e.g., VM addresses, | should supply the NVE with necessary information (e.g., VM addresses, | |||
| VN information, or other parameters) that the NVE uses directly, or | VN information, or other parameters) that the NVE uses directly, or | |||
| skipping to change at page 15, line 24 ¶ | skipping to change at page 17, line 18 ¶ | |||
| No IANA action is required. RFC Editor: please delete this section | No IANA action is required. RFC Editor: please delete this section | |||
| before publication. | before publication. | |||
| 7. Acknowledgements | 7. Acknowledgements | |||
| This document was initiated and merged from the drafts draft-kreeger- | This document was initiated and merged from the drafts draft-kreeger- | |||
| nvo3-hypervisor-nve-cp, draft-gu-nvo3-tes-nve-mechanism and draft- | nvo3-hypervisor-nve-cp, draft-gu-nvo3-tes-nve-mechanism and draft- | |||
| kompella-nvo3-server2nve. Thanks to all the co-authors and | kompella-nvo3-server2nve. Thanks to all the co-authors and | |||
| contributing members of those drafts. | contributing members of those drafts. | |||
| The authors would like to specially thank Jon Hudson for his generous | ||||
| help in improving the readability of this document. | ||||
| 8. References | 8. References | |||
| 8.1 Normative References | 8.1 Normative References | |||
| [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate | [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate | |||
| Requirement Levels", BCP 14, RFC 2119, March 1997. | Requirement Levels", BCP 14, RFC 2119, March 1997. | |||
| 8.2 Informative References | 8.2 Informative References | |||
| [I-D.ietf-nvo3-overlay-problem-statement] Narten, T., Gray, E., | [RFC7364] Narten, T., Gray, E., Black, D., Fang, L., Kreeger, L., and | |||
| Black, D., Fang, L., Kreeger, L., and M. Napierala, | M. Napierala, "Problem Statement: Overlays for Network | |||
| "Problem Statement: Overlays for Network Virtualization", | Virtualization", October 2014. | |||
| draft-ietf-nvo3-overlay-problem-statement-04 (work in | ||||
| progress), July 2013. | ||||
| [I-D.ietf-nvo3-framework] Lasserre, M., Balus, F., Morin, T., Bitar, | [RFC7365] Lasserre, M., Balus, F., Morin, T., Bitar, N., and Y. | |||
| N., and Y. Rekhter, "Framework for DC Network | Rekhter, "Framework for DC Network Virtualization", | |||
| Virtualization", draft-ietf-nvo3-framework-05 (work in | October 2014. | |||
| progress), January 2014. | ||||
| [I-D.ietf-nvo3-nve-nva-cp-req] Kreeger, L., Dutt, D., Narten, T., and | [I-D.ietf-nvo3-nve-nva-cp-req] Kreeger, L., Dutt, D., Narten, T., and | |||
| D. Black, "Network Virtualization NVE to NVA Control | D. Black, "Network Virtualization NVE to NVA Control | |||
| Protocol Requirements", draft-ietf-nvo3-nve-nva-cp-req-01 | Protocol Requirements", draft-ietf-nvo3-nve-nva-cp-req-01 | |||
| (work in progress), October 2013. | (work in progress), October 2013. | |||
| [I-D.ietf-nvo3-arch] Black, D., Narten, T., et al, "An Architecture | [I-D.ietf-nvo3-arch] Black, D., Narten, T., et al, "An Architecture | |||
| for Overlay Networks (NVO3)", draft-narten-nvo3-arch, work | for Overlay Networks (NVO3)", draft-narten-nvo3-arch, work | |||
| in progress. | in progress. | |||
| skipping to change at page 18, line 49 ¶ | skipping to change at page 20, line 49 ¶ | |||
| values is returned to the station via the VDP Response. The returned VID | values is returned to the station via the VDP Response. The returned VID | |||
| value can be a locally significant value. When GroupID is used, it is | value can be a locally significant value. When GroupID is used, it is | |||
| equivalent to the VN ID in NVO3. GroupID will be provided by the | equivalent to the VN ID in NVO3. GroupID will be provided by the | |||
| hypervisor to the bridge. The bridge will map GroupID to a locally | hypervisor to the bridge. The bridge will map GroupID to a locally | |||
| significant VLAN ID. | significant VLAN ID. | |||
| The VSIID in a VDP request that identifies a VM can be in one of the following | The VSIID in a VDP request that identifies a VM can be in one of the following | |||
| formats: IPV4 address, IPV6 address, MAC address, UUID or locally | formats: IPV4 address, IPV6 address, MAC address, UUID or locally | |||
| defined. | defined. | |||
| We compare VDP against the requirements in the following Figure A.6. It | ||||
| should be noted that the comparison is conceptual. Detailed parameter | ||||
| checking is not performed. | ||||
| +------+-----------+----------------------------------------------+ | ||||
| | Req | VDP | remarks | | ||||
| | | supported?| | | ||||
| +------+-----------+----------------------------------------------+ | ||||
| | Req-1| partial |support directly connected but not bridged | | ||||
| +------+-----------+----------------------------------------------+ | ||||
| | Req-2| Yes |VN is represented by GroupID | | ||||
| +------+-----------+----------------------------------------------+ | ||||
| | Req-3| Yes |VID=NULL in request and bridge returns the | | ||||
| | | |assigned value in response | | ||||
| +------+-----------+------------------------+---------------------+ | ||||
| | | requirements | VDP equivalence | | ||||
| | | +------------------------+---------------------+ | ||||
| | Req-4| partial | associate/disassociate| pre-asso/de-asso | | ||||
| | | | activate/deactivate | associate/nil | | ||||
| | | | clear | de-associate | | ||||
| +------+-----------+------------------------+---------------------+ | ||||
| | Req-5| partial | VDP can handle MAC addresses properly. For IP| | ||||
| | | | addresses, it is not clearly specified. | | ||||
| +------+-----------+----------------------------------------------+ | ||||
| | | | | | ||||
| | Req-6| Yes | Error type indicated in Status in response | | ||||
| +------+-----------+----------------------------------------------+ | ||||
| | Req-7| Yes | M bit indicated in Status in request | | ||||
| +------+-----------+----------------------------------------------+ | ||||
| | | For certain information, e.g. new filter info | | ||||
| | Req-8| partial | format, VDP can easily be extended. For some,| | ||||
| | | | extensibility may be limited. | | ||||
| +------+-----------+----------------------------------------------+ | ||||
| Figure A.6 Compare VDP with the requirements | ||||
| Authors' Addresses | Authors' Addresses | |||
| Yizhou Li | Yizhou Li | |||
| Huawei Technologies | Huawei Technologies | |||
| 101 Software Avenue, | 101 Software Avenue, | |||
| Nanjing 210012 | Nanjing 210012 | |||
| China | China | |||
| Phone: +86-25-56625409 | Phone: +86-25-56625409 | |||
| EMail: liyizhou@huawei.com | EMail: liyizhou@huawei.com | |||
| Lucy Yong | Lucy Yong | |||
| End of changes. 85 change blocks. | ||||
| 463 lines changed or deleted | 483 lines changed or added | |||
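
The -00 text of Req-3 asks the NVE to keep a record of the "VN ID, local tag assigned and port p triplet", and the -01 text of Req-7 speaks of a locally-significant tag per VN (e.g., 802.1Q tags). The following is a minimal sketch of such a record, not taken from either draft and not an implementation of VDP. The class name `TagTable`, its methods, the per-port tag scope, and the reference counting of attached TSIs are all assumptions made for illustration.

```python
class TagTable:
    """Hypothetical external-NVE record of (port, VN ID) -> local tag.

    Assumption: tags are significant only per port, so the same tag value
    may be reused on different ports. The default range 1..4094 is the set
    of usable 802.1Q VLAN IDs (0 and 4095 are reserved).
    """

    def __init__(self, first_tag=1, last_tag=4094):
        self.first_tag = first_tag
        self.last_tag = last_tag
        self._tag = {}       # (port, vn_id) -> assigned local tag
        self._attached = {}  # (port, vn_id) -> number of connect requests

    def connect(self, port, vn_id):
        """Handle a connect request and return the local tag for the VN."""
        key = (port, vn_id)
        if key not in self._tag:
            # Collect tags already in use on this port only.
            used = {t for (p, _), t in self._tag.items() if p == port}
            for candidate in range(self.first_tag, self.last_tag + 1):
                if candidate not in used:
                    self._tag[key] = candidate
                    self._attached[key] = 0
                    break
            else:
                raise RuntimeError("no free local tag on port %r" % (port,))
        self._attached[key] += 1
        return self._tag[key]

    def disconnect(self, port, vn_id):
        """Handle a disconnect request; return False if nothing was attached."""
        key = (port, vn_id)
        if key not in self._attached:
            return False
        self._attached[key] -= 1
        if self._attached[key] == 0:
            # Last attachment gone: release the tag for reuse on this port.
            del self._attached[key]
            del self._tag[key]
        return True

    def triplets(self):
        """Return the recorded (VN ID, local tag, port) triplets."""
        return sorted((vn, tag, port) for (port, vn), tag in self._tag.items())
```

In this sketch a second connect request for the same VN on the same port returns the tag already assigned, and the tag is released only when the last attachment is disconnected. A real protocol would also need the reply, failure, and authentication behavior covered by the other requirements.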