Network Working Group                                       P. Garg, Ed.
Internet Draft                                              Y. Wang, Ed.
Intended Category: Informational                               Microsoft
Expires: October 12, 2015                                 April 13, 2015
NVGRE: Network Virtualization using Generic Routing Encapsulation
draft-sridharan-virtualization-nvgre-08.txt
Status of this Memo

This memo provides information for the Internet Community. It does
not specify an Internet standard of any kind; instead it relies on a
proposed standard. Distribution of this memo is unlimited.

Copyright Notice

Copyright (c) 2014 IETF Trust and the persons identified as the
The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html

This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with
respect to this document.

This Internet-Draft will expire on October 12, 2015.
Abstract

This document describes the usage of the Generic Routing
Encapsulation (GRE) header for Network Virtualization (NVGRE) in
multi-tenant datacenters. Network Virtualization decouples virtual
networks and addresses from physical network infrastructure,
providing isolation and concurrency between multiple virtual
networks on the same physical network infrastructure. This document
also introduces a Network Virtualization framework to illustrate the
use cases, but
4.5. Address/Policy Management & Routing......................11
4.6. Cross-subnet, Cross-premise Communication................11
4.7. Internet Connectivity....................................13
4.8. Management and Control Planes............................13
4.9. NVGRE-Aware Devices......................................13
4.10. Network Scalability with NVGRE..........................14
5. Security Considerations.......................................15
6. IANA Considerations...........................................15
7. References....................................................15
7.1. Normative References.....................................15
7.2. Informative References...................................16
8. Authors and Contributors......................................16
9. Acknowledgments...............................................17
1. Introduction
Conventional data center network designs cater to largely static
workloads and cause fragmentation of network and server capacity
[6][7]. There are several issues that limit dynamic allocation and
consolidation of capacity. Layer 2 networks use Rapid Spanning Tree
Protocol (RSTP), which is designed to eliminate loops by blocking
redundant paths. These eliminated paths translate to wasted capacity
and a highly oversubscribed network. There are alternative
approaches, such as TRILL, that address this problem [13].
The network utilization inefficiencies are exacerbated by network
fragmentation due to the use of VLANs for broadcast isolation. VLANs
are used for traffic management and also as the mechanism for
providing security and performance isolation among services
belonging to different tenants. The Layer 2 network is carved into
smaller subnets, typically one subnet per VLAN, with VLAN tags
configured on all the Layer 2 switches connected to server racks
that host a given tenant's services. The current VLAN limits
theoretically allow for 4K such subnets to be created. The size of
the broadcast domain is typically restricted due to the overhead of
broadcast traffic. The 4K VLAN limit is no longer sufficient in a
shared infrastructure servicing multiple tenants.
Data center operators must be able to achieve high utilization of
server and network capacity. In order to achieve efficiency, it
should be possible to assign workloads that operate in a single
Layer 2 network to any server in any rack in the network. It should
also be possible to migrate workloads to any server anywhere in the
network while retaining the workloads' addresses. This can be
achieved today by stretching VLANs; however, when workloads migrate,
the network needs to be reconfigured, which is typically error
prone. By decoupling the workload's location on the LAN from its
network address, the network administrator configures the network
once and not every time a service migrates. This decoupling enables
any server to become part of any server resource pool.
The following are key design objectives for next generation data
centers:

a) location independent addressing

b) the ability to scale the number of logical Layer 2/Layer 3
   networks irrespective of the underlying physical topology or
   the number of VLANs

c) preserving Layer 2 semantics for services and allowing them to
   retain their addresses as they move within and across data
   centers

d) providing broadcast isolation as workloads move around without
   burdening the network control plane
This document describes the use of the Generic Routing Encapsulation
(GRE, [3][4]) header for network virtualization. Network
virtualization decouples a virtual network from the underlying
physical network infrastructure by virtualizing network addresses.
Combined with a management and control plane for the
virtual-to-physical mapping, network virtualization can enable
flexible virtual machine placement and movement, and provide network
isolation for a multi-tenant datacenter.
Network virtualization enables customers to bring their own address
spaces into a multi-tenant datacenter, while the datacenter
administrators can place the customer virtual machines anywhere in
the datacenter without reconfiguring their network switches or
routers, irrespective of the customer address spaces.
1.1. Terminology

Please refer to [9][11] for a more formal definition of the
terminology. The following terms are used in this document.
Customer Address (CA): These are the virtual IP addresses assigned
and configured on the virtual NIC within each VM. These are the only
addresses visible to VMs and applications running within VMs.
Network Virtualization Edge (NVE): An entity that performs the
network virtualization encapsulation and decapsulation.
Provider Address (PA): These are the IP addresses used in the
physical network. PAs are associated with VM CAs through the
network virtualization mapping policy.
Virtual Machine (VM): These are instances of OSs running on top of a
hypervisor over a physical machine or server. Multiple VMs can share
the same physical server via the hypervisor, yet are completely
isolated from each other in terms of compute, storage, and other OS
resources.
Virtual Subnet Identifier (VSID): A 24-bit ID that uniquely
identifies a virtual subnet or virtual Layer 2 broadcast domain.
2. Conventions used in this document

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119 [RFC2119].
In this document, these words will appear with that interpretation
only when in ALL CAPS. Lower case uses of these words are not to be
interpreted as carrying RFC 2119 significance.
3. Network Virtualization using GRE (NVGRE)

This section describes Network Virtualization using GRE, NVGRE.
Network virtualization involves creating virtual Layer 2 topologies
on top of a physical Layer 3 network. Connectivity in the virtual
topology is provided by tunneling Ethernet frames in GRE over IP
over the physical network.
In NVGRE, every virtual Layer 2 network is associated with a 24-bit
identifier, called a Virtual Subnet Identifier (VSID). A VSID is
carried in an outer header as defined in Section 3.2, allowing
unique identification of a tenant's virtual subnet to various
devices in the network. A 24-bit VSID supports up to 16 million
virtual subnets in the same management domain, in contrast to only
4K achievable with VLANs. Each VSID represents a virtual Layer 2
broadcast domain, which can be used to identify a virtual subnet of
a given tenant. To support a multi-subnet virtual topology,
datacenter administrators can configure routes to facilitate
communication between virtual subnets of the same tenant.
GRE is a proposed IETF standard [3][4] and provides a way for
encapsulating an arbitrary protocol over IP. NVGRE leverages the GRE
header to carry VSID information in each packet. The VSID
information in each packet can be used to build multi-tenant-aware
tools for traffic analysis, traffic inspection, and monitoring.
The following sections detail the packet format for NVGRE, describe
the functions of an NVGRE endpoint, illustrate typical traffic flow
both within and across data centers, and discuss address and policy
management and deployment considerations.
3.1. NVGRE Endpoint
NVGRE endpoints are the ingress/egress points between the virtual
and the physical networks. The NVGRE endpoints are the NVEs as
defined in the NVO Framework document [9]. Any physical server or
network device can be an NVGRE endpoint. One common deployment is
for the endpoint to be part of a hypervisor. The primary function of
this endpoint is to encapsulate/decapsulate Ethernet data frames to
and from the GRE tunnel, ensure Layer 2 semantics, and apply
isolation policy scoped on VSID. The endpoint can optionally
participate in routing and function as a gateway in the virtual
topology. To encapsulate an Ethernet frame, the endpoint needs to
know the location information for the destination address in the
frame. This information can be provisioned via a management plane,
or obtained via a combination of control plane distribution or data
plane learning approaches. This document assumes that the location
information, including VSID, is available to the NVGRE endpoint.
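As a non-normative illustration, the location information an endpoint consults before encapsulating can be modeled as a mapping from (VSID, inner destination MAC) to a provider address. The table entries and names below are hypothetical; how the table is populated (management plane, control plane, or data plane learning) is out of scope, as noted above.

```python
# Hypothetical CA-to-PA location mapping consulted by an NVGRE endpoint
# before encapsulation. Entries are illustrative, not normative.
location_table = {
    (0x123456, "00:1d:d8:b7:1c:01"): "192.0.2.10",
    (0x123456, "00:1d:d8:b7:1c:02"): "192.0.2.20",
}

def lookup_pa(vsid, inner_dst_mac):
    """Return the PA of the endpoint hosting this CA, or None if unknown."""
    return location_table.get((vsid, inner_dst_mac))
```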
3.2. NVGRE frame format
The GRE header format as specified in RFC 2784 and RFC 2890 [3][4]
is used for communication between NVGRE endpoints. NVGRE leverages
the Key extension specified in RFC 2890 [4] to carry the VSID. The
packet format for Layer 2 encapsulation in GRE is shown in Figure 1.
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1

Outer Ethernet Header:
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|              (Outer) Destination MAC Address                  |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|(Outer)Destination MAC Address |  (Outer)Source MAC Address    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                (Outer) Source MAC Address                     |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
The GRE header:

o The C (Checksum Present) and S (Sequence Number Present) bits in
  the GRE header MUST be zero.

o The K bit (Key Present) in the GRE header MUST be set to one. The
  32-bit Key field in the GRE header is used to carry the Virtual
  Subnet ID (VSID) and the FlowID:

  - Virtual Subnet ID (VSID): This is a 24-bit value that is used
    to identify the NVGRE-based virtual Layer 2 network.

  - FlowID: This is an 8-bit value that is used to provide per-flow
    entropy for flows in the same VSID. The FlowID MUST NOT be
    modified by transit devices. The encapsulating NVE SHOULD
    provide as much entropy as possible in the FlowID. If a FlowID
    is not generated, it MUST be set to all zeros.

o The protocol type field in the GRE header is set to 0x6558
  (transparent Ethernet bridging) [2].
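The Key field layout above — the 24-bit VSID in the upper bits, the 8-bit FlowID in the lower bits — can be sketched as follows. This is an illustrative helper, not normative text.

```python
def pack_key(vsid, flow_id=0):
    """Pack a 24-bit VSID and an 8-bit FlowID into the 32-bit GRE Key."""
    if not (0 <= vsid < 1 << 24 and 0 <= flow_id < 1 << 8):
        raise ValueError("VSID is 24 bits, FlowID is 8 bits")
    return (vsid << 8) | flow_id

def unpack_key(key):
    """Split a 32-bit GRE Key back into (VSID, FlowID)."""
    return key >> 8, key & 0xFF
```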
The inner headers (headers of the GRE payload):
3.4. Reserved VSID

The VSID range 0-0xFFF is reserved for future use.

The VSID 0xFFFFFF is reserved for vendor-specific NVE-NVE
communication. The sender NVE SHOULD verify the receiver NVE's
vendor before sending a packet using this VSID; however, such a
verification mechanism is outside the scope of this document.
Implementations SHOULD choose a mechanism that meets their
requirements.
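The reserved ranges above translate into a simple validity check; the helper name is an illustrative assumption, not part of the specification.

```python
def is_tenant_vsid(vsid):
    """True if a 24-bit VSID may identify a tenant virtual subnet.

    Per the draft, 0-0xFFF is reserved for future use and 0xFFFFFF is
    reserved for vendor-specific NVE-NVE communication.
    """
    return 0xFFF < vsid < 0xFFFFFF
```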
4. NVGRE Deployment Considerations
4.1. ECMP Support

ECMP may be used to provide load balancing. If ECMP is used, it is
RECOMMENDED that the ECMP hash be calculated using either the outer
IP frame fields and the entire 32-bit Key field, or the inner IP and
transport frame fields.
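The first recommended option — hashing the outer IP addresses together with the entire 32-bit Key — can be sketched as below. Real switches use proprietary hash functions; CRC32 here merely illustrates which fields feed the hash, and the function name is a hypothetical.

```python
import zlib

def ecmp_path_index(outer_src_pa, outer_dst_pa, gre_key, num_paths):
    """Pick a next-hop index from the outer IPs plus the 32-bit GRE Key.

    Because the Key includes the FlowID, flows in the same VSID can
    spread across paths while each flow stays on one path.
    """
    data = f"{outer_src_pa}|{outer_dst_pa}|{gre_key:08x}".encode()
    return zlib.crc32(data) % num_paths
```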
4.2. Broadcast and Multicast Traffic

To support broadcast and multicast traffic inside a virtual subnet,
one or more administratively scoped multicast addresses [8][10] can
be assigned for the VSID. All multicast or broadcast traffic
originating from within a VSID is encapsulated and sent to the
assigned multicast address. From an administrative standpoint it is
possible for network operators to configure a PA multicast address
for each multicast address that is used inside a VSID, to facilitate
optimal multicast handling. Depending on the hardware capabilities
of the physical network devices and the physical network
architecture, multiple virtual subnets may re-use the same physical
IP multicast address.

Alternatively, based upon the configuration at the NVE, broadcast
and multicast in the virtual subnet can be supported using N-way
unicast. In N-way unicast, the sender NVE would send one
encapsulated packet to every NVE in the virtual subnet. The sender
NVE can encapsulate and send the packet as described in Section 4.3
(Unicast Traffic). This alleviates the need for multicast support
in the physical network.
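The N-way unicast replication described above can be sketched as a simple loop over the member NVEs of a virtual subnet. The function shape and parameter names are illustrative assumptions; `encapsulate` and `send` stand in for the endpoint's unicast path from Section 4.3.

```python
def n_way_unicast(frame, vsid, member_pas, encapsulate, send):
    """Replicate a broadcast/multicast frame as one unicast copy per NVE.

    member_pas lists the PA of every other NVE in the virtual subnet;
    each copy is encapsulated and sent exactly as unicast traffic.
    """
    for pa in member_pas:
        send(pa, encapsulate(frame, vsid, pa))
```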
4.3. Unicast Traffic

The NVGRE endpoint encapsulates a Layer 2 packet in GRE using the
source PA associated with the endpoint, with the destination PA
corresponding to the location of the destination endpoint. As
outlined earlier, there can be one or more PAs associated with an
endpoint, and policy will control which ones get used for
communication. The encapsulated GRE packet is bridged and routed
normally by the physical network to the destination PA. Bridging
uses the outer Ethernet encapsulation for scope on the LAN. The only
requirement is bi-directional IP connectivity from the underlying
physical network. On the destination, the NVGRE endpoint
decapsulates the GRE packet to recover the original Layer 2 frame.
Traffic flows similarly on the reverse path.
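The GRE portion of this encapsulation — the fixed flags from Section 3.2 (K=1, C=S=0), the 0x6558 protocol type, and the Key carrying VSID and FlowID — can be built as shown below. The outer IP and Ethernet headers addressed to the destination PA would be prepended by the sender's stack and are omitted; this is a sketch, not a reference implementation.

```python
import struct

GRE_FLAGS_KEY_PRESENT = 0x2000  # C=0, K=1, S=0, Version=0
ETHERTYPE_TEB = 0x6558          # transparent Ethernet bridging

def nvgre_header(vsid, flow_id=0):
    """Build the 8-byte GRE header carrying the VSID and FlowID.

    Layout: 2-byte flags/version, 2-byte protocol type, 4-byte Key
    (VSID in the top 24 bits, FlowID in the bottom 8 bits).
    """
    key = (vsid << 8) | flow_id
    return struct.pack("!HHI", GRE_FLAGS_KEY_PRESENT, ETHERTYPE_TEB, key)
```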
4.4. IP Fragmentation

RFC 2003 [12] Section 5.1 specifies mechanisms for handling
fragmentation when encapsulating IP within IP. The subset of
mechanisms NVGRE selects is intended to ensure that NVGRE
encapsulated frames are not fragmented after encapsulation en route
to the destination NVGRE endpoint, and that traffic sources can
leverage Path MTU discovery.

A sender NVE MUST NOT fragment NVGRE packets. A receiver NVE MAY
discard fragmented NVGRE packets. It is RECOMMENDED that the MTU of
the physical network accommodate the larger frame size due to
encapsulation. Path MTU discovery or configuration via the control
plane can be used to meet this requirement.
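The encapsulation overhead behind this recommendation is fixed: 8 bytes of GRE header with the Key extension plus the outer IP header (20 bytes for IPv4 without options, 40 for IPv6). A small worked calculation, as a sketch under those assumptions:

```python
GRE_WITH_KEY = 8   # 4-byte GRE base header + 4-byte Key field
OUTER_IPV4 = 20    # outer IPv4 header without options
OUTER_IPV6 = 40    # outer IPv6 header

def required_underlay_ip_mtu(inner_frame_len, outer_ip_len=OUTER_IPV4):
    """Underlay IP MTU needed so encapsulated frames avoid fragmentation.

    inner_frame_len is the full inner Ethernet frame, e.g. 1514 bytes
    for an untagged frame carrying a 1500-byte payload.
    """
    return inner_frame_len + GRE_WITH_KEY + outer_ip_len
```

For example, a 1514-byte inner Ethernet frame needs a 1542-byte underlay IP MTU over IPv4, or 1562 bytes over IPv6.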
4.5. Address/Policy Management & Routing

Address acquisition is beyond the scope of this document; addresses
can be obtained statically, dynamically, or using stateless address
auto-configuration. CA and PA space can be either IPv4 or IPv6. In
fact, the address families don't have to match; for example, a CA
can be IPv4 while the PA is IPv6, and vice versa.
4.6. Cross-subnet, Cross-premise Communication
4.8. Management and Control Planes

There are several protocols that can manage and distribute policy;
however, this is out of the scope of this document. Implementations
SHOULD choose a mechanism that meets their scale requirements.
4.9. NVGRE-Aware Devices

One example of a typical deployment consists of virtualized servers
deployed across multiple racks connected by one or more layers of
Layer 2 switches, which in turn may be connected to a Layer 3
routing domain. Even though routing in the physical infrastructure
will work without any modification with NVGRE, devices that perform
specialized processing in the network need to be able to parse GRE
to get access to tenant-specific information. Devices that
understand and parse the VSID can provide rich multi-tenancy-aware
services inside the data center. As outlined earlier, it is
imperative to exploit multiple paths inside the network through
techniques such as Equal Cost Multipath (ECMP). The Key field (a
32-bit field, including both the VSID and the optional FlowID) can
provide additional entropy to the switches to exploit path diversity
inside
skipping to change at page 15, line 4 skipping to change at page 15, line 4
and in turn MAC address table scalability that can be achieved. and in turn MAC address table scalability that can be achieved.
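As an illustration of the parsing described above, the following
Python sketch extracts the tenant identifier a VSID-aware device
would need. It assumes the NVGRE header layout defined in this draft
(GRE with the Key Present bit set, Protocol Type 0x6558 for
Transparent Ethernet Bridging, and a 32-bit Key field carrying the
24-bit VSID and 8-bit FlowID); the function name is illustrative:

```python
import struct

NVGRE_PROTOCOL_TYPE = 0x6558  # Transparent Ethernet Bridging

def parse_nvgre_key(gre_header: bytes):
    """Return (vsid, flow_id) from a GRE header, or None if not NVGRE.

    gre_header is the raw packet bytes starting at the GRE header.
    """
    if len(gre_header) < 8:
        return None
    # First 16 bits: C/R/K/S flags + version; next 16 bits: Protocol Type.
    flags_ver, proto = struct.unpack_from("!HH", gre_header, 0)
    key_present = bool(flags_ver & 0x2000)  # K bit
    if not key_present or proto != NVGRE_PROTOCOL_TYPE:
        return None
    # The 32-bit Key field follows: upper 24 bits VSID, lower 8 bits FlowID.
    (key,) = struct.unpack_from("!I", gre_header, 4)
    vsid = key >> 8        # Virtual Subnet ID
    flow_id = key & 0xFF   # optional per-flow entropy for ECMP hashing
    return vsid, flow_id
```

A switch hashing on the full 32-bit Key field (VSID plus FlowID)
rather than the VSID alone can spread a single tenant's flows across
multiple equal-cost paths.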
An NVGRE endpoint can use one PA to represent multiple CAs. This
lowers the burden on the MAC address table sizes at the Top-of-Rack
switches. One obvious benefit is in the context of server
virtualization, which has increased the demands on the network
infrastructure. By embedding an NVGRE endpoint in a hypervisor it is
possible to scale significantly. This framework allows location
information to be preconfigured inside an NVGRE endpoint, allowing
broadcast ARP traffic to be proxied locally. This approach can scale
to large virtual subnets. These virtual subnets can be spread across
multiple Layer 3 physical subnets. It allows workloads to be moved
around without imposing a huge burden on the network control plane.
By eliminating most broadcast traffic and converting the rest to
multicast, routers and switches can function more efficiently by
building efficient multicast trees. By using server and network
capacity efficiently it is possible to drive down the cost of
building and managing data centers.
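The preconfigured location information described above can be
sketched as a simple mapping table in the endpoint. The Python below
is a hypothetical illustration, not part of this specification: it
keys on (VSID, customer IP) so that one PA can back many CAs, ARP
requests can be answered locally instead of broadcast, and the outer
(PA) destination can be looked up at encapsulation time. All class
and method names are illustrative:

```python
class LocationTable:
    """Hypothetical per-endpoint policy table: (VSID, CA IP) -> (CA MAC, PA)."""

    def __init__(self):
        self._map = {}  # (vsid, ca_ip) -> (ca_mac, pa_ip)

    def add(self, vsid, ca_ip, ca_mac, pa_ip):
        # Many CAs may share the same pa_ip: one PA represents multiple CAs.
        self._map[(vsid, ca_ip)] = (ca_mac, pa_ip)

    def proxy_arp(self, vsid, target_ca_ip):
        """Answer an ARP request locally; None means no entry (do not answer)."""
        entry = self._map.get((vsid, target_ca_ip))
        return entry[0] if entry else None

    def encap_dest(self, vsid, ca_ip):
        """PA to use as the outer IP destination when encapsulating to this CA."""
        entry = self._map.get((vsid, ca_ip))
        return entry[1] if entry else None
```

Because lookups never leave the endpoint, moving a workload only
requires updating these entries, not reconvergence in the physical
network.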
5. Security Considerations

This proposal extends the Layer 2 subnet across the data center and
increases the scope for spoofing attacks. Such attacks can be
mitigated with authentication/encryption using IPsec or any other
IP-based mechanism. The control plane for policy distribution is
expected to be secured by using any of the existing security
protocols. Further, management traffic can be isolated in a separate
subnet/VLAN.
The checksum in the GRE header is not supported. The mitigation is to
deploy an NVGRE-based solution in a network that provides error
detection along the NVGRE packet path, for example, using
skipping to change at page 15, line 47
[2] Ethertypes, ftp://ftp.isi.edu/in-
    notes/iana/assignments/ethernet-numbers

[3] D. Farinacci et al, "Generic Routing Encapsulation (GRE)", RFC
    2784, March 2000.

[4] G. Dommety, "Key and Sequence Number Extensions to GRE", RFC
    2890, September 2000.
[5] Institute of Electrical and Electronics Engineers, "Virtual
Bridged Local Area Networks", IEEE Standard 802.1Q, 2005
Edition, May 2006.
7.2. Informative References

[6] A. Greenberg et al, "VL2: A Scalable and Flexible Data Center
    Network", Proc. SIGCOMM 2009.

[7] A. Greenberg et al, "The Cost of a Cloud: Research Problems in
    the Data Center", ACM SIGCOMM Computer Communication Review.

[8] B. Hinden, S. Deering, "IP Version 6 Addressing Architecture",
    RFC 4291, February 2006.

[9] M. Lasserre et al, "Framework for DC Network Virtualization",
    RFC 7365, October 2014.

[10] D. Meyer, "Administratively Scoped IP Multicast", BCP 23, RFC
     2365, July 1998.

[11] T. Narten et al, "Problem Statement: Overlays for Network
     Virtualization", RFC 7364, October 2014.

[12] C. Perkins, "IP Encapsulation within IP", RFC 2003, October
     1996.

[13] J. Touch, R. Perlman, "Transparent Interconnection of Lots of
     Links (TRILL): Problem and Applicability Statement", RFC 5556,
     May 2009.
8. Authors and Contributors

M. Sridharan
A. Greenberg
Y. Wang
P. Garg
N. Venkataramiah
End of changes. 45 change blocks. 77 lines changed or deleted, 83
lines changed or added.

This html diff was produced by rfcdiff 1.48. The latest version is available from http://tools.ietf.org/tools/rfcdiff/