| < draft-ietf-nvo3-encap-05.txt | draft-ietf-nvo3-encap-06.txt > | |||
|---|---|---|---|---|
| NVO3 Workgroup S. Boutros, Ed. | NVO3 Workgroup S. Boutros, Ed. | |||
| Internet-Draft Ciena | Internet-Draft Ciena | |||
| Intended status: Informational February 17, 2020 | Intended Status: Informational D. Eastlake, Ed. | |||
| Expires: August 20, 2020 | Futurewei | |||
| Expires: December 8, 2021 June 9, 2021 | ||||
| NVO3 Encapsulation Considerations | NVO3 Encapsulation Considerations | |||
| draft-ietf-nvo3-encap-05 | draft-ietf-nvo3-encap-06 | |||
| Abstract | Abstract | |||
| As communicated by the WG Chairs, the IETF NVO3 chairs and Routing | ||||
| As communicated by WG Chairs, the IETF NVO3 chairs and Routing Area | Area director have chartered a design team to take forward the | |||
| director have chartered a design team to take forward the | ||||
| encapsulation discussion and see if there is potential to design a | encapsulation discussion and see if there is potential to design a | |||
| common encapsulation that addresses the various technical concerns. | common encapsulation that addresses the various technical concerns. | |||
| There are implications of different encapsulations in real | There are implications of different encapsulations in real | |||
| environments consisting of both software and hardware implementations | environments consisting of both software and hardware implementations | |||
| and spanning multiple data centers. For example, OAM functions such | and spanning multiple data centers. For example, OAM functions such | |||
| as path MTU discovery become challenging with multiple encapsulations | as path MTU discovery become challenging with multiple encapsulations | |||
| along the data path. | along the data path. | |||
| The design team recommend Geneve with few modifications as the common | The design team recommends Geneve with a few modifications as the | |||
| encapsulation, more details are described in section 7. | common encapsulation. This document provides more details, | |||
| particularly in Section 7. | ||||
| Status of This Memo | Status of This Document | |||
| This Internet-Draft is submitted in full conformance with the | This Internet-Draft is submitted in full conformance with the | |||
| provisions of BCP 78 and BCP 79. | provisions of BCP 78 and BCP 79. | |||
| Distribution of this document is unlimited. Comments should be sent | ||||
| to the authors or the IDR Working Group mailing list <nvo3@ietf.org>. | ||||
| Internet-Drafts are working documents of the Internet Engineering | Internet-Drafts are working documents of the Internet Engineering | |||
| Task Force (IETF). Note that other groups may also distribute | Task Force (IETF), its areas, and its working groups. Note that | |||
| working documents as Internet-Drafts. The list of current Internet- | other groups may also distribute working documents as Internet- | |||
| Drafts is at https://datatracker.ietf.org/drafts/current/. | Drafts. | |||
| Internet-Drafts are draft documents valid for a maximum of six months | Internet-Drafts are draft documents valid for a maximum of six months | |||
| and may be updated, replaced, or obsoleted by other documents at any | and may be updated, replaced, or obsoleted by other documents at any | |||
| time. It is inappropriate to use Internet-Drafts as reference | time. It is inappropriate to use Internet-Drafts as reference | |||
| material or to cite them other than as "work in progress." | material or to cite them other than as "work in progress." | |||
| This Internet-Draft will expire on August 20, 2020. | The list of current Internet-Drafts can be accessed at | |||
| https://www.ietf.org/1id-abstracts.html. The list of Internet-Draft | ||||
| Shadow Directories can be accessed at | ||||
| https://www.ietf.org/shadow.html. | ||||
| Copyright Notice | Copyright Notice | |||
| Internet-Draft NVO3 Encapsulation Considerations | ||||
| Copyright (c) 2020 IETF Trust and the persons identified as the | Copyright (c) 2021 IETF Trust and the persons identified as the | |||
| document authors. All rights reserved. | document authors. All rights reserved. | |||
| This document is subject to BCP 78 and the IETF Trust's Legal | This document is subject to BCP 78 and the IETF Trust's Legal | |||
| Provisions Relating to IETF Documents | Provisions Relating to IETF Documents | |||
| (https://trustee.ietf.org/license-info) in effect on the date of | (http://trustee.ietf.org/license-info) in effect on the date of | |||
| publication of this document. Please review these documents | publication of this document. Please review these documents | |||
| carefully, as they describe your rights and restrictions with respect | carefully, as they describe your rights and restrictions with respect | |||
| to this document. Code Components extracted from this document must | to this document. Code Components extracted from this document must | |||
| include Simplified BSD License text as described in Section 4.e of | include Simplified BSD License text as described in Section 4.e of | |||
| the Trust Legal Provisions and are provided without warranty as | the Trust Legal Provisions and are provided without warranty as | |||
| described in the Simplified BSD License. | described in the Simplified BSD License. | |||
| Table of Contents | Table of Contents | |||
| 1. Problem Statement . . . . . . . . . . . . . . . . . . . . . . 3 | 1. Introduction............................................4 | |||
| 2. Design Team Goals . . . . . . . . . . . . . . . . . . . . . . 3 | 2. Design Team Goals.......................................4 | |||
| 3. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 4 | 3. Terminology.............................................5 | |||
| 4. Abbreviations . . . . . . . . . . . . . . . . . . . . . . . . 4 | 4. Abbreviations and Acronyms..............................5 | |||
| 5. Issues with current Encapsulations . . . . . . . . . . . . . 4 | ||||
| 5.1. Geneve . . . . . . . . . . . . . . . . . . . . . . . . . 4 | ||||
| 5.2. GUE . . . . . . . . . . . . . . . . . . . . . . . . . . . 4 | ||||
| 5.3. VXLAN-GPE . . . . . . . . . . . . . . . . . . . . . . . . 5 | ||||
| 6. Common Encapsulation Considerations . . . . . . . . . . . . . 5 | ||||
| 6.1. Current Encapsulations . . . . . . . . . . . . . . . . . 5 | ||||
| 6.2. Useful Extensions Use cases . . . . . . . . . . . . . . . 5 | ||||
| 6.2.1. Telemetry extensions. . . . . . . . . . . . . . . . . 6 | ||||
| 6.2.2. Security/Integrity extensions . . . . . . . . . . . . 6 | ||||
| 6.2.3. Group Base Policy . . . . . . . . . . . . . . . . . . 7 | ||||
| 6.3. Hardware Considerations . . . . . . . . . . . . . . . . . 7 | ||||
| 6.4. Extension Size . . . . . . . . . . . . . . . . . . . . . 7 | ||||
| 6.5. Extension Ordering . . . . . . . . . . . . . . . . . . . 8 | ||||
| 6.6. TLV vs Bit Fields . . . . . . . . . . . . . . . . . . . . 8 | ||||
| 6.7. Control Plane Considerations . . . . . . . . . . . . . . 9 | ||||
| 6.8. Split NVE . . . . . . . . . . . . . . . . . . . . . . . . 10 | ||||
| 6.9. Larger VNI Considerations . . . . . . . . . . . . . . . . 11 | ||||
| 7. Design team recommendations . . . . . . . . . . . . . . . . . 11 | ||||
| 8. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 13 | ||||
| 9. Security Considerations . . . . . . . . . . . . . . . . . . . 14 | ||||
| 10. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 14 | ||||
| 11. Appendix A . . . . . . . . . . . . . . . . . . . . . . . . . 14 | ||||
| 11.1. Overview . . . . . . . . . . . . . . . . . . . . . . . . 14 | ||||
| 11.2. Extensibility . . . . . . . . . . . . . . . . . . . . . 14 | ||||
| 11.2.1. Native Extensibility Support . . . . . . . . . . . . 14 | ||||
| 11.2.2. Extension Parsing . . . . . . . . . . . . . . . . . 14 | ||||
| 11.2.3. Critical Extensions . . . . . . . . . . . . . . . . 15 | ||||
| 11.2.4. Maximal Header Length . . . . . . . . . . . . . . . 15 | ||||
| 11.3. Encapsulation Header . . . . . . . . . . . . . . . . . . 15 | ||||
| 11.3.1. Virtual Network Identifier (VNI) . . . . . . . . . . 15 | ||||
| 11.3.2. Next Protocol . . . . . . . . . . . . . . . . . . . 15 | ||||
| 11.3.3. Other Header Fields . . . . . . . . . . . . . . . . 16 | ||||
| 11.4. Comparison Summary . . . . . . . . . . . . . . . . . . . 16 | 5. Issues with Current Encapsulations......................6 | |||
| 12. Contributors . . . . . . . . . . . . . . . . . . . . . . . . 17 | 5.1. Geneve................................................6 | |||
| 13. References . . . . . . . . . . . . . . . . . . . . . . . . . 18 | 5.2. GUE...................................................6 | |||
| 13.1. Normative References . . . . . . . . . . . . . . . . . . 18 | 5.3. VXLAN-GPE.............................................6 | |||
| 13.2. Informative References . . . . . . . . . . . . . . . . . 18 | ||||
| Author's Address . . . . . . . . . . . . . . . . . . . . . . . . 19 | ||||
| 1. Problem Statement | 6. Common Encapsulation Considerations.....................7 | |||
| 6.1. Current Encapsulations................................7 | ||||
| 6.2. Useful Extensions Use Cases...........................7 | ||||
| 6.2.1. Telemetry Extensions................................7 | ||||
| 6.2.2. Security/Integrity Extensions.......................8 | ||||
| 6.2.3 Group Base Policy....................................8 | ||||
| 6.3. Hardware Considerations...............................9 | ||||
| 6.4. Extension Size........................................9 | ||||
| 6.5. Extension Ordering...................................10 | ||||
| 6.6. TLV versus Bit Fields................................10 | ||||
| 6.7. Control Plane Considerations.........................11 | ||||
| 6.8. Split NVE............................................12 | ||||
| 6.9. Larger VNI Considerations............................12 | ||||
| As communicated by WG Chairs, the NVO3 WG charter states that it may | 7. Design Team Recommendations............................13 | |||
| produce requirements for network virtualization data planes based on | 8. Acknowledgements.......................................16 | |||
| encapsulation of virtual network traffic over an IP-based underlay | ||||
| 9. Security Considerations................................16 | ||||
| 10. IANA Considerations...................................16 | ||||
| 11. References............................................17 | ||||
| 11.1 Normative References.................................17 | ||||
| 11.2 Informative References...............................17 | ||||
| Appendix A: Encapsulations Comparison.....................19 | ||||
| A.1. Overview.............................................19 | ||||
| A.2. Extensibility........................................19 | ||||
| A.2.1. Native Extensibility Support.......................19 | ||||
| A.2.2. Extension Parsing..................................19 | ||||
| A.2.3. Critical Extensions................................20 | ||||
| A.2.4. Maximal Header Length..............................20 | ||||
| A.3. Encapsulation Header.................................20 | ||||
| A.3.1. Virtual Network Identifier (VNI)...................20 | ||||
| A.3.2. Next Protocol......................................20 | ||||
| A.3.3. Other Header Fields................................21 | ||||
| A.4. Comparison Summary...................................21 | ||||
| Contributors..............................................23 | ||||
| 1. Introduction | ||||
| As communicated by the WG Chairs, the NVO3 WG Charter states that it | ||||
| may produce requirements for network virtualization data planes based | ||||
| on encapsulation of virtual network traffic over an IP-based underlay | ||||
| data plane. Such requirements should consider OAM and security. | data plane. Such requirements should consider OAM and security. | |||
| Based on these requirements the WG will select, extend, and/or | Based on these requirements the WG will select, extend, and/or | |||
| develop one or more data plane encapsulation format(s). | develop one or more data plane encapsulation format(s). | |||
| This has led to drafts describing three encapsulations being adopted | This has led to WG drafts and an RFC describing three encapsulations | |||
| by the working group: | as follows: | |||
| - [I-D.ietf-nvo3-geneve] | - [RFC8926] Geneve: Generic Network Virtualization Encapsulation | |||
| - [I-D.ietf-nvo3-gue] | - [I-D.ietf-intarea-gue] Generic UDP Encapsulation | |||
| - [I-D.ietf-nvo3-vxlan-gpe] | - [I-D.ietf-nvo3-vxlan-gpe] Generic Protocol Extension for VXLAN | |||
| (VXLAN-GPE) | ||||
| Discussion on the list and in face-to-face meetings has identified a | Discussion on the list and in face-to-face meetings has identified a | |||
| number of technical problems with each of these encapsulations. | number of technical problems with each of these encapsulations. | |||
| Furthermore, there was clear consensus at the IETF meeting in Berlin | Furthermore, there was clear consensus at the 96th IETF meeting in | |||
| that it is undesirable for the working group to progress more than | Berlin that it is undesirable for the working group to progress more | |||
| one data plane encapsulation. Although consensus could not be | than one data plane encapsulation. Although consensus could not be | |||
| reached on the list, the overall consensus was for a single | reached on the list, the overall consensus was for a single | |||
| encapsulation [RFC2418],Section 3.3. | encapsulation [RFC2418], Section 3.3. | |||
| Nonetheless there has been resistance to converging on a single | Nonetheless there has been resistance to converging on a single | |||
| encapsulation format. | encapsulation format. | |||
| 2. Design Team Goals | 2. Design Team Goals | |||
| As communicated by WG Chairs, the design team should take one of the | As communicated by the WG Chairs, the design team should take one of | |||
| proposed encapsulations and enhance it to address the technical | the proposed encapsulations and enhance it to address the technical | |||
| concerns. The simple evolution of deployed networks as well as | concerns. The simple evolution of deployed networks as well as | |||
| applicability to all locations in the NVO3 architecture are goals. | applicability to all locations in the NVO3 architecture are goals. | |||
| The DT should specifically avoid a design that is burdensome on | The DT should specifically avoid a design that is burdensome on | |||
| hardware implementations, but should allow future extensibility. The | hardware implementations but should allow future extensibility. The | |||
| chosen design should also operate well with ICMP and in ECMP | chosen design should also operate well with ICMP and in ECMP | |||
| environments. If further extensibility is required, then it should | environments. If further extensibility is required, then it should | |||
| be done in such a manner that it does not require the consent of an | be done in such a manner that it does not require the consent of an | |||
| entity outside of the IETF. | entity outside of the IETF. | |||
| 3. Terminology | 3. Terminology | |||
| The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | |||
| "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this | "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and | |||
| document are to be interpreted as described in [RFC2119]. | "OPTIONAL" in this document are to be interpreted as described in BCP | |||
| 14 [RFC2119] [RFC8174] when, and only when, they appear in all | ||||
| capitals, as shown here. | ||||
| 4. Abbreviations | 4. Abbreviations and Acronyms | |||
| DT NVO3 encapsulation Design Team | ||||
| NVO3 Network Virtualization Overlays over Layer 3 | NVO3 Network Virtualization Overlays over Layer 3 | |||
| OAM Operations, Administration, and Maintenance | OAM Operations, Administration, and Maintenance | |||
| TLV Type, Length, and Value | TLV Type, Length, and Value | |||
| VNI Virtual Network Identifier | VNI Virtual Network Identifier | |||
| NVE Network Virtualization Edge | NVE Network Virtualization Edge | |||
| NVA Network Virtualization Authority | NVA Network Virtualization Authority | |||
| NIC Network interface card | NIC Network interface card | |||
| TCAM Ternary Content-Addressable Memory | ||||
| Transit device Underlay network devices between NVE(s). | Transit device - Underlay network devices between NVE(s). | |||
| 5. Issues with current Encapsulations | 5. Issues with Current Encapsulations | |||
| As summarized by WG Chairs. | The following subsections describe issues with current encapsulations | |||
| as summarized by the WG Chairs: | ||||
| 5.1. Geneve | 5.1. Geneve | |||
| - Can't be implemented cost-effectively in all use cases because | - Can't be implemented cost-effectively in all use cases because | |||
| variable length header and order of the TLVs makes it costly (in | variable length header and order of the TLVs makes it costly (in | |||
| terms of number of gates) to implement in hardware | terms of number of gates) to implement in hardware. | |||
| - Header doesn't fit into largest commonly available parse buffer | - Header doesn't fit into largest commonly available parse buffer | |||
| (256 bytes in NIC). Cannot justify doubling buffer size unless it is | (256 bytes in NIC). Cannot justify doubling buffer size unless it is | |||
| mandatory for hardware to process additional option fields. | mandatory for hardware to process additional option fields. | |||
| 5.2. GUE | 5.2. GUE | |||
| - There were a significant number of objections related to the | - There were a significant number of objections related to the | |||
| complexity of implementation in hardware, similar to those noted for | complexity of implementation in hardware, similar to those noted for | |||
| Geneve above. | Geneve above. | |||
| 5.3. VXLAN-GPE | 5.3. VXLAN-GPE | |||
| - GPE is not day-1 backwards compatible with VXLAN. Although the | - GPE is not day-1 backwards compatible with VXLAN. Although the | |||
| frame format is similar, it uses a different UDP port, so would | frame format is similar, it uses a different UDP port, so would | |||
| require changes to existing implementations even if the rest of the | require changes to existing implementations even if the rest of the | |||
| GPE frame is the same. | GPE frame is the same. | |||
| - GPE is insufficiently extensible. Numerous extensions and options | - GPE is insufficiently extensible. Numerous extensions and options | |||
| have been designed for GUE and Geneve. Note that these have not yet | have been designed for GUE and Geneve. Note that these have not yet | |||
| been validated by the WG. | been validated by the WG. | |||
| - Security e.g. of the VNI has not been addressed by GPE. Although a | - Security, e.g., of the VNI, has not been addressed by GPE. | |||
| shim header could be used for security and other extensions, this has | Although a shim header could be used for security and other | |||
| not been defined yet and its implications on offloading in NICs are | extensions, this has not been defined yet and its implications on | |||
| not understood. | offloading in NICs are not understood. | |||
| 6. Common Encapsulation Considerations | 6. Common Encapsulation Considerations | |||
| 6.1. Current Encapsulations | 6.1. Current Encapsulations | |||
| Appendix A includes a detailed comparison between the three proposed | Appendix A includes a detailed comparison between the three proposed | |||
| encapsulations. The comparison indicates several common properties, | encapsulations. The comparison indicates several common properties | |||
| but also three major differences among the encapsulations: | but also three major differences among the encapsulations: | |||
| - Extensibility: Geneve and GUE were defined with built-in | - Extensibility: Geneve and GUE were defined with built-in | |||
| extensibility, while VXLAN-GPE is not inherently extensible. Note | extensibility, while VXLAN-GPE is not inherently extensible. Note | |||
| that any of the three encapsulations can be extended using the | that any of the three encapsulations can be extended using the | |||
| Network Service Header (NSH). | Network Service Header (NSH [RFC8300]). | |||
| - Extension method: Geneve is extensible using Type/Length/Value | - Extension method: Geneve is extensible using Type/Length/Value | |||
| (TLV) fields, while GUE uses a small set of possible extensions, and | (TLV) fields, while GUE uses a small set of possible extensions, and | |||
| a set of flags that indicate which extension is present. | a set of flags that indicate which extensions are present. | |||
| - Length field: Geneve and GUE include a Length field, indicating the | - Length field: Geneve and GUE include a Length field, indicating the | |||
| length of the encapsulation header, while VXLAN-GPE does not include | length of the encapsulation header while VXLAN-GPE does not include | |||
| such a field. | such a field. | |||
| 6.2. Useful Extensions Use cases | 6.2. Useful Extensions Use Cases | |||
| Non vendor specific TLV MUST follow the standardization process. The | Non vendor specific TLVs MUST follow the standardization process. | |||
| following use cases for extensions show that there is a strong | The following use cases for extensions show that there is a strong | |||
| requirement to support variable length extensions with possible | requirement to support variable length extensions with possible | |||
| different subtypes. | different subtypes. | |||
| 6.2.1. Telemetry extensions. | 6.2.1. Telemetry Extensions | |||
| In several scenarios it is beneficial to make information about the | In several scenarios it is beneficial to make information about the | |||
| path a packet took through the network or through a network device as | path a packet took through the network or through a network device as | |||
| well as associated telemetry information available to the operator. | well as associated telemetry information available to the operator. | |||
| This includes not only tasks like debugging, troubleshooting, as well | This includes not only tasks like debugging, troubleshooting, and | |||
| as network planning and network optimization but also policy or | network planning and optimization but also policy or service level | |||
| service level agreement compliance checks. | agreement compliance checks. | |||
| Packet scheduling algorithms, especially for balancing traffic across | Packet scheduling algorithms, especially for balancing traffic across | |||
| equal cost paths or links, often leverage information contained | equal cost paths or links, often leverage information contained | |||
| within the packet, such as protocol number, IP-address or MAC- | within the packet, such as protocol number, IP-address, or MAC- | |||
| address. Probe packets would thus either need to be sent from the | address. Probe packets would thus either need to be sent between the | |||
| exact same endpoints with the exact same parameters, or probe packets | exact same endpoints with the exact same parameters, or probe packets | |||
| would need to be artificially constructed as "fake" packets and | would need to be artificially constructed as "fake" packets and | |||
| inserted along the path. Both approaches are often not feasible from | inserted along the path. Both approaches are often not feasible from | |||
| an operational perspective, be it that access to the end-system is | an operational perspective, be it that access to the end-system is | |||
| not feasible, or that the diversity of parameters and associated | not feasible, or that the diversity of parameters and associated | |||
| probe packets to be created is simply too large. An in-band | probe packets to be created is simply too large. An extension | |||
| telemetry mechanism in extensions is an alternative in those cases. | providing an in-band telemetry mechanism is an alternative in those | |||
| cases. | ||||
| 6.2.2. Security/Integrity extensions | 6.2.2. Security/Integrity Extensions | |||
| Since the currently proposed NVO3 encapsulations do not protect their | Since the currently proposed NVO3 encapsulations do not protect their | |||
| headers a single bit corruption in the VNI field could deliver a | headers, a single bit corruption in the VNI field could deliver a | |||
| packet to the wrong tenant. Extensions are needed to use any | packet to the wrong tenant. Extensions are needed to use any | |||
| sophisticated security. | sophisticated security. | |||
| The possibility of VNI spoofing with an NVO3 protocol is exacerbated | The possibility of VNI spoofing with an NVO3 protocol is exacerbated | |||
| by the use of UDP. Systems typically have no restrictions on | by using UDP. Systems typically have no restrictions on applications | |||
| applications being able to send to any UDP port so an unprivileged | being able to send to any UDP port so an unprivileged application can | |||
| application can trivially spoof for instance, VXLAN packets, | trivially spoof VXLAN packets for instance, including using arbitrary | |||
| including using arbitrary VNIs. | VNIs. | |||
| One can envision HMAC-like support in some NVO3 extension to | One can envision HMAC-like support in some NVO3 extension to | |||
| authenticate the header and the outer IP addresses, thereby | authenticate the header and the outer IP addresses, thereby | |||
| preventing attackers from injecting packets with spoofed VNIs. | preventing attackers from injecting packets with spoofed VNIs. | |||
| An other aspect of security is payload security. Essentially this is | Another aspect of security is payload security. Essentially this is | |||
| to make packets that look like IP|UDP|NVO3 Encap|DTLS/IPSEC-ESP | to make packets that look like IP|UDP|NVO3 Encap|DTLS/IPSEC-ESP | |||
| Extension|payload. This is nice since we still have the UDP header | Extension|payload. This is nice since we still have the UDP header | |||
| for ECMP, the NVO3 header is in plain text so it can by read by | for ECMP, the NVO3 header is in plain text so it can be read by | |||
| network elements, and different security or other payload transforms | network elements, and different security or other payload transforms | |||
| can be supported on a single UDP port (we don't need a separate UDP | can be supported on a single UDP port (we don't need a separate UDP | |||
| port for DTLS/IPSEC). | port for DTLS/IPSEC). | |||
| 6.2.3. Group Base Policy | 6.2.3 Group Base Policy | |||
| Another use case would be to carry the Group Based Policy (GBP) | Another use case would be to carry the Group Based Policy (GBP) | |||
| source group information within an NVO3 header extension in a similar | source group information within an NVO3 header extension in a similar | |||
| manner as has been implemented for VXLAN | manner as has been implemented for VXLAN | |||
| [I-D.smith-vxlan-group-policy]. This allows various forms of policy | [I-D.smith-vxlan-group-policy]. This allows various forms of policy | |||
| such as access control and QoS to be applied between abstract groups | such as access control and QoS to be applied between abstract groups | |||
| rather than coupled to specific endpoint addresses. | rather than coupled to specific endpoint addresses. | |||
| 6.3. Hardware Considerations | 6.3. Hardware Considerations | |||
| Hardware restrictions should be taken into consideration along with | Hardware restrictions should be taken into consideration along with | |||
| future hardware enhancements that may provide more flexible metadata | future hardware enhancements that may provide more flexible metadata | |||
| processing. However, the set of options that need to and will be | processing. However, the set of options that need to and will be | |||
| implemented in hardware will be a subset of what is implemented in | implemented in hardware will be a subset of what is implemented in | |||
| software, since software NVEs are likely to grow features, and hence | software, since software NVEs are likely to grow features, and hence | |||
| option support, at a more rapid rate. | option support, at a more rapid rate. | |||
| We note that it is hard to predict which options will be implemented | We note that it is hard to predict which options will be implemented | |||
| in which piece of hardware and when. That depends on whether the | in which piece of hardware and when. That depends on whether the | |||
| hardware will be in the form of a NIC providing increasing offload | hardware will be in the form of a NIC providing increasing offload | |||
| capabilities to software NVEs, or a switch chip being used as an NVE | capabilities to software NVEs, or a switch chip being used as an NVE | |||
| gateway towards non-NVO3 parts of the network, or even an transit | gateway towards non-NVO3 parts of the network, or even a transit | |||
| devices that participates in the NVO3 dataplane e.g. for OAM | device that participates in the NVO3 dataplane, e.g., for OAM | |||
| purposes. | purposes. | |||
| A result of this is that it doesn't look useful to prescribe some | A result of this is that it doesn't look useful to prescribe some | |||
| order of the options so that the ones that are likely to be | order of the options so that the ones that are likely to be | |||
| implemented in hardware come first; we can't decide such an order | implemented in hardware come first; we can't decide such an order | |||
| when we define the options, however a control plane can enforce such | when we define the options, however a control plane can enforce such | |||
| order for some hardware implementations. | an order for some hardware implementation. | |||
| We do know that hardware needs to initially be able to efficiently | We do know that hardware needs to initially be able to efficiently | |||
| skip over the NVO3 header to find the inner payload. That is needed | skip over the NVO3 header to find the inner payload. That is needed | |||
| for both NICs doing e.g. TCP offload and transit devices and NVEs | both for NICs doing TCP offload and for transit devices and NVEs | |||
| applying policy/ACLs to the inner payload. | applying policy/ACLs to the inner payload. | |||
| 6.4. Extension Size | 6.4. Extension Size | |||
| Extension header length has a significant impact to hardware and | Extension header length has a significant impact on hardware and | |||
| software implementations. A total header length that is too small | software implementations. A total header length that is too small | |||
| will unnecessarily constrain software flexibility. A total header | will unnecessarily constrain software flexibility. A total header | |||
| length that is too large will place a nontrivial cost on hardware | length that is too large will place a nontrivial cost on hardware | |||
| implementations. Thus, the design team recommends that there be a | implementations. Thus, the design team recommends that there be a | |||
| minimum and maximum total extension header length selected. The | minimum and maximum total extension header length selected. The | |||
| maximum total header length is determined by the bits allocated for | maximum total header length is determined by the bits allocated for | |||
| the total extension header length field. The risk with this approach | the total extension header length field. The risk with this approach | |||
| is that it may be difficult to extend the total header size in the | is that it may be difficult to extend the total header size in the | |||
| future. The minimum total header length is determined by a | future. The minimum total header length is determined by a | |||
| requirement in the specifications that all implementations must meet. | requirement in the specifications that all implementations must meet. | |||
| The risk with this approach is that all implementations will only | The risk with this approach is that all implementations will only | |||
| implement the minimum total header length which would then become the | implement the minimum total header length which would then become the | |||
| de facto maximum total header length. The recommended minimum total | de facto maximum total header length. The recommended minimum total | |||
| header length is 64 bytes. | header length is 64 bytes. | |||
| Single Extension size should always be 4 bytes aligned. | Single Extension size should always be 4 byte aligned. | |||
| The maximum length of a single option should be large enough to meet | The maximum length of a single option should be large enough to meet | |||
| the different extension use case requirements e.g. in-band telemetry | the different extension use case requirements, e.g., in-band | |||
| and future use. | telemetry and future use. | |||
| 6.5. Extension Ordering | 6.5. Extension Ordering | |||
| In order to support hardware nodes at the tunnel endpoint or at the | To support hardware nodes at the tunnel endpoint or at a transit | |||
| transit that can process one or few extensions TLVs in TCAM. A | device that can process one or a few extensions TLVs in TCAM, a | |||
| control plane in such a deployment can signal a capability to ensure | control plane in such a deployment can signal a capability to ensure | |||
| a specific TLV will always appear in a specific order for example the | a specific TLV will always appear in a specific order, for example | |||
| first one in the packet. | the first one in the packet. | |||
| The order of the TLVs should be HW friendly for both the sender and | The order of the TLVs should be hardware friendly for both the sender | |||
| the receiver and possibly the transit node too. | and the receiver and possibly the transit device also. | |||
| Transit nodes do not participate in control plane communication | Transit devices do not participate in control plane communication | |||
| between the end points and are not required to process the options | between the end points and are not required to process the options; | |||
| however, if they do, they need to process only a small subset of | however, if they do, they need to process only a small subset of | |||
| options that will be consumed by tunnel endpoints. | options that will be consumed by tunnel endpoints. | |||
| 6.6. TLV vs Bit Fields | 6.6. TLV versus Bit Fields | |||
| If there is a well-known initial set of options that are likely to be | If there is a well-known initial set of options that are likely to be | |||
| implemented in software and in hardware, it can be efficient to use | implemented in software and in hardware, it can be efficient to use | |||
| the bit-field approach as in GUE. However, as described in section | the bit-field approach as in GUE. However, as described in section | |||
| 6.3, if options are added over time and different subsets of options | 6.3, if options are added over time and different subsets of options | |||
| are likely to be implemented in different pieces of hardware, then it | are likely to be implemented in different pieces of hardware, then it | |||
| would be hard for the IETF to specify which options should get the | would be hard for the IETF to specify which options should get the | |||
| early bit fields. TLVs are a lot more flexible, which avoids the | early bit fields. TLVs are a lot more flexible, which avoids the | |||
| need to determine the relative importance of different options. | need to determine the relative importance of different options. | |||
| However, general TLV of arbitrary order, size, and repetition of the | However, general TLV of arbitrary order, size, and repetition of the | |||
| same order is difficult to implement in hardware. A middle ground is | same order is difficult to implement in hardware. A middle ground is | |||
| to use TLV with restrictions on the size and alignment, observing | to use TLVs with restrictions on their size and alignment, observing | |||
| that individual TLVs can have a fixed length, and support in the | that individual TLVs can have a fixed length, and support in the | |||
| control plane such that an NVE will only receive options that to | control plane such that an NVE will only receive options that it | |||
| needs and implements. The control plane approach can potentially be | needs and implements. The control plane approach can potentially be | |||
| used to control the order of the TLVs sent to a particular NVE. Note | used to control the order of the TLVs sent to a particular NVE. Note | |||
| that transit devices are not likely to participate in the control | that transit devices are not likely to participate in the control | |||
| plane hence to the extent that they need to participate in option | plane; hence, to the extent that they need to participate in option | |||
| processing they need more effort, But transit devices would have | processing, they need more effort. Transit devices would have issues | |||
| issues with future GUE bits being defined for future options as well. | with future GUE bits being defined for future options as well. | |||
| A benefit of TLVs from a HW perspective is that they are self | A benefit of TLVs from a hardware perspective is that they are self | |||
| describing i.e., all the information is in the TLV. In a Bit fields | describing, i.e., all the information is in the TLV. In a Bit fields | |||
| approach the hardware needs to look up the bit to determine the | approach the hardware needs to look up the bit to determine the | |||
| length of the data associated with the bit through some separate | length of the data associated with the bit through some separate | |||
| table, which would add hardware complexity. | table, which would add hardware complexity. | |||
| There are use cases where multiple modules of software are running on | There are use cases where multiple modules of software are running on | |||
| NVE. This can be modules such as a diagnostic module by one vendor | an NVE. This can be modules such as a diagnostic module by one | |||
| that does packet sampling and another module from a different vendor | vendor that does packet sampling and another module from a different | |||
| that does a firewall. Using a TLV format, it is easier to have | vendor that does a firewall. Using a TLV format, it is easier to | |||
| different software modules process different TLVs, which could be | have different software modules process different TLVs, which could | |||
| standard extensions or vendor specific extensions defined by the | be standard extensions or vendor specific extensions defined by the | |||
| different vendors, without conflicting with each other. This can | different vendors, without conflicting with each other. This can | |||
| help with hardware modularity as well. There are some | help with hardware modularity as well. There are some | |||
| implementations with options that allow different software like mac | implementations with options that allow different software, like MAC | |||
| learning and security handle different options. | learning and security, to handle different options. | |||
| 6.7. Control Plane Considerations | 6.7. Control Plane Considerations | |||
| Given that we want to allow large flexibility and extensibility for | Given that we want to allow considerable flexibility and | |||
| e.g. software NVEs, yet be able to support key extensions in less | extensibility for, e.g., software NVEs, yet be able to support | |||
| flexible e.g. hardware NVEs, it is useful to consider the control | important extensions in less flexible contexts such as hardware NVEs, | |||
| plane. By control plane in this context we mean both protocols such | it is useful to consider the control plane. By control plane in this | |||
| as EVPN and others, and also deployment specific configuration. | section we mean both protocols, such as EVPN and others, and | |||
| deployment specific configuration. | ||||
| If each NVE can express in the control plane that they only care | If each NVE can express in the control plane that they only care | |||
| about particular extensions (could be a single extension, or a few), | about particular extensions (could be a single extension, or a few), | |||
| and the source NVEs only include requested extensions in the NVO3 | and the source NVEs only include requested extensions in the NVO3 | |||
| packets, then the target NVE can both use a simpler parser (e.g., a | packets, then the target NVE can both use a simpler parser (e.g., a | |||
| TCAM might be usable to look for a single NVO3 extension) and the | TCAM might be usable to look for a single NVO3 extension) and the | |||
| depth of the inner payload in the NVO3 packet will be minimized. | depth of the inner payload in the NVO3 packet will be minimized. | |||
| Furthermore, if the target NVE cares about a few extensions and can | Furthermore, if the target NVE cares about a few extensions and can | |||
| express in the control plane the desired order of those extensions in | express in the control plane the desired order of those extensions in | |||
| the NVO3 packets, then it can provide useful functionality with | the NVO3 packets, then it can provide useful functionality with | |||
| minimal hardware requirements. | minimal hardware requirements. | |||
| Note that transit devices that are not aware of the NVO3 extensions | Note that transit devices that are not aware of the NVO3 extensions | |||
| somewhat benefit from such an approach, since the inner payload is | somewhat benefit from such an approach, since the inner payload is | |||
| less deep in the packet if no extraneous extensions are included in | less deep in the packet if no extraneous extensions are included in | |||
| the packet. However, in general a transit device is not likely to | the packet. In general, a transit device is not likely to | |||
| participate in the NVO3 control plane. (However, configuration | participate in the NVO3 control plane. (However, configuration | |||
| mechanisms can take into account limitations of the transit devices | mechanisms can take into account limitations of the transit devices | |||
| used in particular deployments.) | used in particular deployments.) | |||
| Note that in this approach different NVEs could desire different | Note that in this approach different NVEs could desire different | |||
| (sets of) extensions, which means that the source NVE needs to be | extensions or sets of extensions, which means that the source NVE | |||
| able to place different sets of extensions in different NVO3 packets, | needs to be able to place different sets of extensions in different | |||
| and perhaps in different order. It also assumes that underlay | NVO3 packets, and perhaps in different order. It also assumes that | |||
| multicast or replication servers are not used together with NVO3 | underlay multicast or replication servers are not used together with | |||
| extensions. | NVO3 extensions. | |||
| There is a need to consider mandatory extensions versus optional | There is a need to consider mandatory extensions versus optional | |||
| extensions. Mandatory extensions require the receiver to drop the | extensions. Mandatory extensions require the receiver to drop the | |||
| packet if the extension is unknown. A control plane mechanism can | packet if the extension is unknown. A control plane mechanism can | |||
| prevent the need for dropping unknown extensions, since they would | prevent the need for dropping unknown extensions, since they would | |||
| not be included to targets that do not support them. | not be included to targets that do not support them. | |||
| The control planes defined today need to add the ability to describe | The control planes defined today need to add the ability to describe | |||
| the different encapsulations. Thus perhaps EVPN, and any other | the different encapsulations. Thus, perhaps EVPN and any other | |||
| control plane protocol that the IETF defines, should have a way to | control plane protocol that the IETF defines should have a way to | |||
| enumerate the supported NVO3 extensions and their order. | enumerate the supported NVO3 extensions and their order. | |||
| The WG should consider developing a separate draft on guidance for | The WG should consider developing a separate draft on guidance for | |||
| option processing and control plane participation. This should | option processing and control plane participation. This should | |||
| provide examples/guidance on a range of usage models and deployment | provide examples/guidance on a range of usage models and deployment | |||
| scenarios for specific options and ordering that are relevant for | scenarios for specific options and ordering that are relevant for | |||
| that specific deployment. This includes end points and middle boxes | that specific deployment. This includes end points and middle boxes | |||
| using the options. So, having the control plane negotiate the | using the options. So, having the control plane negotiate the | |||
| constraints is most appropriate and flexible way to address these | constraints is the most appropriate and flexible way to address these | |||
| requirements. | requirements. | |||
| 6.8. Split NVE | 6.8. Split NVE | |||
| If the working group sees a need for having the hosts send and | If the working group sees a need for having the hosts send and | |||
| receive options in a split NVE case, this is possible using any of | receive options in a split NVE case, this is possible using any of | |||
| the existing extensible encapsulations (Geneve, GUE, GPE+NSH) by | the existing extensible encapsulations (Geneve, GUE, GPE+NSH) by | |||
| defining a way to carry those over other transports. NSH can already | defining a way to carry those over other transports. NSH can already | |||
| be used over different transports. | be used over different transports. | |||
| If we need to do this with other encapsulations it can be done by | If we need to do this with other encapsulations it can be done by | |||
| defining an Ether type for other encapsulations so that it can be | defining an Ether type for other encapsulations so that it can be | |||
| carried over Ethernet and 802.1Q. | carried over Ethernet and 802.1Q. | |||
| If we need to carry other encapsulations over MPLS, it would require | If we need to carry other encapsulations over MPLS, it would require | |||
| an EVPN control plane to signal that other encapsulation header + | an EVPN control plane to signal that other encapsulation header + | |||
| options will be present in front of the L2 packet. The VNI can be | options will be present in front of the L2 packet. The VNI can be | |||
| ignored in the header, and the MPLS label will be the one used to | ignored in the header, and the MPLS label will be the one used to | |||
| identify the EVPN L2 instance. | identify the EVPN L2 instance. | |||
| 6.9. Larger VNI Considerations | 6.9. Larger VNI Considerations | |||
| We discussed whether we should make VNI 32-bits or larger. The | We discussed whether we should make the VNI 32-bits or larger. The | |||
| benefit of 24-bit VNI would be to avoid unnecessary changes with | benefit of a 24-bit VNI would be to avoid unnecessary changes with | |||
| existing proposals and implementations that are almost all, if not | existing proposals and implementations that are almost all, if not | |||
| all, are using 24-bit VNI. If we need a larger VNI, an extension can | all, using 24-bit VNI. If we need a larger VNI, an extension can be | |||
| be used to support that. | used to support that. | |||
| 7. Design team recommendations | 7. Design Team Recommendations | |||
| We concluded that Geneve is most suitable as a starting point for | We concluded that Geneve is most suitable as a starting point for a | |||
| proposed standard for network virtualization, for the following | proposed standard for network virtualization, for the following | |||
| reasons: | reasons: | |||
| 1. We studied whether VNI should be in base header or in extensions | 1. We studied whether VNI should be in the base header or in | |||
| and whether it should be 24-bit or 32-bit. The design team agreed | extensions and whether it should be 24-bit or 32-bit. The design | |||
| that VNI is critical information for network virtualization and MUST | team agreed that VNI is critical information for network | |||
| be present in all packets. Design team also agreed that 24-bit VNI | virtualization and MUST be present in all packets. The design team | |||
| matches the existing widely used encapsulation format i.e. VxLAN and | also agreed that a 24-bit VNI matches the existing widely used | |||
| NVGRE and hence more suitable to use going forward. | encapsulation formats, i.e., VxLAN and NVGRE, and hence is more | |||
| suitable to use going forward. | ||||
| 2. Geneve has the total options length that allow skipping over the | 2. The Geneve header has the total options length which allows | |||
| options for NIC offload operations, and will allow transit devices to | skipping over the options for NIC offload operations and will allow | |||
| view flow information in the inner payload. | transit devices to view flow information in the inner payload. | |||
| 3. We considered the option of using NSH with VxLAN-GPE but given | 3. We considered the option of using NSH [RFC8300] with VxLAN-GPE | |||
| that NSH is targeted at service chaining and contains service | but given that NSH is targeted at service chaining and contains | |||
| chaining information, it is less suitable for the network | service chaining information, it is less suitable for the network | |||
| virtualization use case. The other downside for VxLAN-GPE was lack | virtualization use case. The other downside for VxLAN-GPE was lack | |||
| of header length in VxLAN-GPE and hence makes skipping over the | of header length in VxLAN-GPE which makes skipping over the headers | |||
| headers to process inner payload more difficult. Total Option Length | to process inner payload more difficult. Total Option Length is | |||
| is present in Geneve. It is not possible to skip any options in the | present in Geneve. It is not possible to skip any options in the | |||
| middle with VxLAN-GPE. In principle a split between a base header | middle with VxLAN-GPE. In principle a split between a base header | |||
| and a header with options is interesting (whether that options header | and a header with options is interesting (whether that options header | |||
| is NSH or some new header without ties to a service path). We | is NSH or some new header without ties to a service path). We | |||
| explored whether it would make sense to either use NSH for this, or | explored whether it would make sense to either use NSH for this, or | |||
| define a new NVO3 options header. However, we observed that this | define a new NVO3 options header. However, we observed that this | |||
| makes it slightly harder to find the inner payload since the length | makes it slightly harder to find the inner payload since the length | |||
| field is not in the NVO3 header itself. Thus one more field would | field is not in the NVO3 header itself. Thus, one more field would | |||
| have to be extracted to compute the start of the inner payload. | have to be extracted to compute the start of the inner payload. | |||
| Also, if the experience with IPv6 extension headers is a guidance, | Also, if the experience with IPv6 extension headers is a guide, there | |||
| there would be a risk that key pieces of hardware might not implement | would be a risk that key pieces of hardware might not implement the | |||
| the options header, resulting in future calls to deprecate its use. | options header, resulting in future calls to deprecate its use. | |||
| Making the options part of the base NVO3 header has fewer of those | Making the options part of the base NVO3 header has fewer of those | |||
| issues. Even though the implementation of any particular option can | issues. Even though the implementation of any particular option can | |||
| not be predicted ahead of time, the option mechanism and ability to | not be predicted ahead of time, the option mechanism and ability to | |||
| skip the options is likely to be broadly implemented. | skip the options is likely to be broadly implemented. | |||
| 4. We compared the TLV vs Bit-fields style extension and it was | 4. We compared the TLV vs Bit-fields style extension and it was | |||
| deemed that parsing both TLV and bit-fields is expensive and while | deemed that parsing both TLV and bit-fields is expensive and while | |||
| bit-fields may be simpler to parse, it is also more restrictive and | bit-fields may be simpler to parse, it is also more restrictive and | |||
| requires guessing which extensions will be widely implemented so they | requires guessing which extensions will be widely implemented so they | |||
| can get early bit assignments, given that half the bits are already | can get early bit assignments, given that half the bits are already | |||
| assigned in GUE, a widely deployed extension may appear in a flag | assigned in GUE, a widely deployed extension may appear in a flag | |||
| extension, and this will require extra processing, to dig the flag | extension, and this will require extra processing, to dig the flag | |||
| from the flag extension and then look for the extension itself. As | from the flag extension and then look for the extension itself. Also | |||
| well Bit-fields are not flexible enough to address the requirements | Bit-fields are not flexible enough to address the requirements from | |||
| from OAM, Telemetry and security extensions, for variable length | OAM, Telemetry, and security extensions, for variable length option | |||
| option and different subtypes of the same option. While TLV are more | and different subtypes of the same option. While TLV are more | |||
| flexible, a control plane can restrict the number of option TLVs as | flexible, a control plane can restrict the number of option TLVs as | |||
| well as the order and size of the TLVs to make it simpler for a | well as the order and size of the TLVs to make it simpler for a | |||
| dataplane implementation to handle. | dataplane implementation to handle. | |||
| 5. We briefly discussed multi-vendor NVE case, and the need to allow | 5. We briefly discussed the multi-vendor NVE case, and the need to | |||
| vendors to put their own extensions in the NVE header. This is | allow vendors to put their own extensions in the NVE header. This is | |||
| possible with TLVs. | possible with TLVs. | |||
| 6. We also agreed that the C bit in Geneve is helpful to allow | 6. We also agreed that the C bit in Geneve is helpful to allow a | |||
| receiver NVE to easily decide whether to process options or not. For | receiver NVE to easily decide whether to process options or not, for | |||
| example a UUID based packet trace and how an optional extension such | example a UUID based packet trace, and how an optional extension such | |||
| as that can be ignored by receiver NVE and thus make it easy for NVE | as that can be ignored by a receiver NVE and thus make it easy for | |||
| to skip over the options. Thus the C-bit remains as defined in | NVE to skip over the options. Thus, the C-bit remains as defined in | |||
| Geneve. | Geneve. | |||
| 7. There are already some extensions that are being discussed (see | 7. There are already some extensions that are being discussed (see | |||
| section 6.2) of varying sizes, by using Geneve option it is possible | section 6.2) of varying sizes. By using Geneve option it is possible | |||
| to get in band parameters like: switch id, ingress port, egress port, | to get in band parameters like switch id, ingress port, egress port, | |||
| internal delay, and queue in telemetry defined extension TLV from | internal delay, and queue in telemetry defined extension TLV from | |||
| switches. It is also possible to add Security extension TLVs like | switches. It is also possible to add Security extension TLVs like | |||
| HMAC and DTLS/IPSEC to authenticate the Geneve packet header and | HMAC and DTLS/IPSEC to authenticate the Geneve packet header and | |||
| secure the Geneve packet payload by software or hardware tunnel | secure the Geneve packet payload by software or hardware tunnel | |||
| endpoints. As well, a Group Based Policy extension TLV can be | endpoints. A Group Based Policy extension TLV can be carried as | |||
| carried. | well. | |||
| 8. There are implemented Geneve options today in production. There | 8. There are implemented Geneve options today in production. There | |||
| are as well new HW supporting Geneve TLV parsing. In addition In- | are as well new hardware supporting Geneve TLV parsing. In addition, | |||
| band Telemetry (INT) specification being developed by P4.org | an In-band Telemetry (INT) specification is being developed by P4.org | |||
| illustrates the option of INT meta data carried over Geneve. OVN/OVS | that illustrates the option of INT meta data carried over Geneve. | |||
| have also defined some option TLV(s) for Geneve. | OVN/OVS have also defined some option TLV(s) for Geneve. | |||
| 9. The DT has addressed the usage models while considering the | 9. The DT has addressed the usage models while considering the | |||
| requirements and implementations in general that includes software | requirements and implementations in general that includes software | |||
| and hardware. | and hardware. | |||
| There seems to be interest to standardize some well known secure | There seems to be interest to standardize some well-known secure | |||
| option TLVs to secure the header and payload to guarantee | option TLVs to secure the header and payload to guarantee | |||
| encapsulation header integrity and tenant data privacy. The design | encapsulation header integrity and tenant data privacy. The design | |||
| team recommends that the working group consider standardizing such | team recommends that the working group consider standardizing such | |||
| option(s). | option(s). | |||
| We recommend the following enhancements to Geneve to make it more | We recommend the following enhancements to Geneve to make it more | |||
| suitable to hardware and yet provide the flexibility for software: | suitable to hardware and yet provide the flexibility for software: | |||
| We would propose a text such as, while TLV are more flexible, a | We would propose a text such as, while TLV are more flexible, a | |||
| control plane can restrict the number of option TLVs as well as the | control plane can restrict the number of option TLVs as well as the | |||
| order and size of the TLVs to make it simpler for a data plane | order and size of the TLVs to make it simpler for a data plane | |||
| implementation in software or hardware to handle. For example, there | implementation in software or hardware to handle. For example, there | |||
| may be some critical information such as secure hash that must be | may be some critical information such as a secure hash that must be | |||
| processed in certain order at lowest latency. | processed in a certain order at lowest latency. | |||
| A control plane can negotiate a subset of option TLVs and certain TLV | A control plane can negotiate a subset of option TLVs and certain TLV | |||
| ordering, as well can limit the total number of option TLVs present | ordering, as well as limiting the total number of option TLVs present | |||
| in the packet, for example, to allow hardware capable of processing | in the packet, for example, to allow for hardware capable of | |||
| fewer options. Hence, the control planes need to have the ability to | processing fewer options. Hence, the control plane needs to have the | |||
| describe the supported TLVs subset and their order. | ability to describe the supported TLVs subset and their order. | |||
| The Geneve draft could specify that the subset and order of option | The Geneve draft could specify that the subset and order of option | |||
| TLVs should be configurable for each remote NVE in the absence of a | TLVs should be configurable for each remote NVE in the absence of a | |||
| protocol control plane. | protocol control plane. | |||
| We recommend Geneve to follow fragmentation recommendations in | We recommend that Geneve follow fragmentation recommendations in | |||
| overlay services like PWE3, and L2/L3 VPN recommendation to guarantee | overlay services like PWE3 and the L2/L3 VPN recommendations to | |||
| larger MTU for the tunnel overhead [RFC3985],Section 5.3. | guarantee larger MTU for the tunnel overhead ([RFC3985] Section 5.3). | |||
| We request Geneve to provide a recommendation for critical bit | We request that Geneve provide a recommendation for critical bit | |||
| processing - text could look like how critical bits can be used with | processing - text could specify how critical bits can be used with | |||
| control plane specifying the critical options. | control plane specifying the critical options. | |||
| Given that there is a telemetry option use case for a length of 256 | Given that there is a telemetry option use case for a length of 256 | |||
| bytes, we recommend Geneve to increase the Single TLV option length | bytes, we recommend that Geneve increase the Single TLV option length | |||
| to 256. | to 256. | |||
| We request Geneve to address Requirements for OAM considerations for | We request that Geneve address Requirements for OAM considerations | |||
| alternate marking and for performance measurements that need 2 bits | for alternate marking and for performance measurements that need 2 | |||
| in the header. And clarify the need of the current OAM bit in the | bits in the header and clarify the need for the current OAM bit in | |||
| Geneve Header. | the Geneve Header. | |||
| We recommend the WG to work on security options for Geneve. | We recommend that the WG work on security options for Geneve. | |||
| 8. Acknowledgements | 8. Acknowledgements | |||
| The authors would like to thank Tom Herbert for providing the | The authors would like to thank Tom Herbert for providing the | |||
| motivation for the Security/Integrity extension, and for his valuable | motivation for the Security/Integrity extension, and for his valuable | |||
| comments, and would like to thank T. Sridhar for his valuable | comments, and would like to thank T. Sridhar for his valuable | |||
| comments and feedback. | comments and feedback. | |||
| 9. Security Considerations | 9. Security Considerations | |||
| This document does not introduce any additional security constraints. | This document does not introduce any additional security constraints. | |||
| 10. IANA Considerations | 10. IANA Considerations | |||
| This document has no actions for IANA. | This document has no actions for IANA. | |||
| 11. Appendix A | Internet-Draft NVO3 Encapsulation Considerations | |||
| 11.1. Overview | 11. References | |||
| 11.1 Normative References | ||||
| [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate | ||||
| Requirement Levels", BCP 14, RFC 2119, DOI | ||||
| 10.17487/RFC2119, March 1997, <https://www.rfc- | ||||
| editor.org/info/rfc2119>. | ||||
| [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC 2119 | ||||
| Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, May | ||||
| 2017, <https://www.rfc-editor.org/info/rfc8174>. | ||||
| 11.2 Informative References | ||||
| [I-D.herbert-gue-extensions] Herbert, T., Yong, L., and F. Templin, | ||||
| "Extensions for Generic UDP Encapsulation", | ||||
| draft-herbert-gue-extensions-01 (work in progress), October | ||||
| 2016. | ||||
| [I-D.ietf-intarea-gue] Herbert, T., Yong, L., and O. Zia, "Generic | ||||
| UDP Encapsulation", draft-ietf-intarea-gue (work in | ||||
| progress), October 2019. | ||||
| [I-D.ietf-nvo3-vxlan-gpe] Maino, F., Kreeger, L., and U. Elzur, | ||||
| "Generic Protocol Extension for VXLAN", | ||||
| draft-ietf-nvo3-vxlan-gpe (work in progress), March 2021. | ||||
| [I-D.smith-vxlan-group-policy] Smith, M. and L. Kreeger, "VXLAN Group | ||||
| Policy Option", draft-smith-vxlan-group-policy-05 (work in | ||||
| progress), October 2018. | ||||
| [RFC2418] Bradner, S., "IETF Working Group Guidelines and | ||||
| Procedures", BCP 25, RFC 2418, DOI 10.17487/RFC2418, | ||||
| September 1998, <https://www.rfc-editor.org/info/rfc2418>. | ||||
| [RFC3985] Bryant, S., Ed. and P. Pate, Ed., "Pseudo Wire Emulation | ||||
| Edge-to-Edge (PWE3) Architecture", RFC 3985, DOI | ||||
| 10.17487/RFC3985, March 2005, <https://www.rfc- | ||||
| editor.org/info/rfc3985>. | ||||
| [RFC8300] Quinn, P., Ed., Elzur, U., Ed., and C. Pignataro, Ed., | ||||
| "Network Service Header (NSH)", RFC 8300, DOI | ||||
| 10.17487/RFC8300, January 2018, <https://www.rfc- | ||||
| editor.org/info/rfc8300>. | ||||
| [RFC8926] Gross, J., Ed., Ganga, I., Ed., and T. Sridhar, Ed., | ||||
| "Geneve: Generic Network Virtualization Encapsulation", RFC | ||||
| 8926, DOI 10.17487/RFC8926, November 2020, | ||||
| <https://www.rfc-editor.org/info/rfc8926>. | ||||
| Appendix A: Encapsulations Comparison | ||||
| A.1. Overview | ||||
| This section presents a comparison of the three NVO3 encapsulation | This section presents a comparison of the three NVO3 encapsulation | |||
| proposals, Geneve, GUE, and VXLAN-GPE. The three encapsulations use | proposals, Geneve, GUE, and VXLAN-GPE. The three encapsulations use | |||
| an outer UDP/IP transport. Geneve and VXLAN-GPE use an 8-octet | an outer UDP/IP transport. Geneve and VXLAN-GPE use an 8-octet | |||
| header, while GUE uses a 4-octet header. In addition to the base | header, while GUE uses a 4-octet header. In addition to the base | |||
| header, optional extensions may be included in the encapsulation, as | header, optional extensions may be included in the encapsulation, as | |||
| discussed in Section 3.2 below. | discussed in Section A.2 below. | |||
| 11.2. Extensibility | A.2. Extensibility | |||
| 11.2.1. Native Extensibility Support | A.2.1. Native Extensibility Support | |||
| The Geneve and GUE encapsulations both enable optional headers to be | The Geneve and GUE encapsulations both enable optional headers to be | |||
| incorporated at the end of the base encapsulation header. | incorporated at the end of the base encapsulation header. | |||
| VXLAN-GPE does not provide native support for header extensions. | VXLAN-GPE does not provide native support for header extensions. | |||
| However, as discussed in [I-D.ietf-nvo3-vxlan-gpe], extensibility can | However, as discussed in [I-D.ietf-nvo3-vxlan-gpe], extensibility can | |||
| be attained to some extent if the Network Service Header (NSH) | be attained to some extent if the Network Service Header (NSH) | |||
| [RFC8300] is used immediately following the VXLAN-GPE header. NSH | [RFC8300] is used immediately following the VXLAN-GPE header. NSH | |||
| supports either a fixed-size extension (MD Type 1), or a variable- | supports either a fixed-size extension (MD Type 1), or a variable- | |||
| size TLV-based extension (MD Type 2). It should be noted that NSH- | size TLV-based extension (MD Type 2). It should be noted that NSH- | |||
| over-VXLAN-GPE implies an additional overhead of the 8- octets NSH | over-VXLAN-GPE implies an additional overhead of the 8-octets NSH | |||
| header, in addition to the VXLAN-GPE header. | header, in addition to the VXLAN-GPE header. | |||
| 11.2.2. Extension Parsing | A.2.2. Extension Parsing | |||
| The Geneve Variable Length Options are defined as Type/Length/ | The Geneve Variable Length Options are defined as Type/Length/Value | |||
| Value(TLV) extensions. Similarly, VXLAN-GPE, when using NSH, can | (TLV) extensions. Similarly, VXLAN-GPE, when using NSH, can include | |||
| include NSH TLV-based extensions. In contrast, GUE defines a small | NSH TLV-based extensions. In contrast, GUE defines a small set of | |||
| set of possible extension fields (proposed in | possible extension fields (proposed in [I-D.herbert-gue-extensions]), | |||
| [I-D.herbert-gue-extensions], and a set of flags in the GUE header | and a set of flags in the GUE header that indicate for each extension | |||
| that indicate for each extension type whether it is present or not. | type whether it is present or not. | |||
| TLV-based extensions, as defined in Geneve, provide the flexibility | TLV-based extensions, as defined in Geneve, provide the flexibility | |||
| for a large number of possible extension types. Similar behavior can | for a large number of possible extension types. Similar behavior can | |||
| be supported in NSH-over-VXLAN-GPE when using MD Type 2. The flag- | be supported in NSH-over-VXLAN-GPE when using MD Type 2. The flag- | |||
| based approach taken in GUE strives to simplify implementations by | based approach taken in GUE strives to simplify implementations by | |||
| defining a small number of possible extensions, used in a fixed | defining a small number of possible extensions used in a fixed order. | |||
| order. | ||||
| The Geneve and GUE headers both include a length field, defining the | The Geneve and GUE headers both include a length field, defining the | |||
| total length of the encapsulation, including the optional extensions. | total length of the encapsulation, including the optional extensions. | |||
| The length field simplifies the parsing of transit devices that skip | The length field simplifies the parsing of transit devices that skip | |||
| the encapsulation header without parsing its extensions. | the encapsulation header without parsing its extensions. | |||
| 11.2.3. Critical Extensions | A.2.3. Critical Extensions | |||
| The Geneve encapsulation header includes the 'C' field, which | The Geneve encapsulation header includes the 'C' field, which | |||
| indicates whether the current Geneve header includes critical | indicates whether the current Geneve header includes critical | |||
| options, which must be parsed by the tunnel endpoint. If the | options, that is to say, options which must be parsed by the tunnel | |||
| endpoint is not able to process the critical option, the packet is | endpoint. If the endpoint is not able to process a critical option, | |||
| discarded. | the packet is discarded. | |||
| 11.2.4. Maximal Header Length | A.2.4. Maximal Header Length | |||
| The maximal header length in Geneve, including options, is 260 | The maximal header length in Geneve, including options, is 260 | |||
| octets. GUE defines the maximal header to be 128 octets. VXLAN-GPE | octets. GUE defines the maximal header to be 128 octets. VXLAN-GPE | |||
| uses a fixed-length header of 8 octets, unless NSH-over-VXLAN-GPE is | uses a fixed-length header of 8 octets, unless NSH-over-VXLAN-GPE is | |||
| used, yielding an encapsulation header of up to 264 octets. | used, yielding an encapsulation header of up to 264 octets. | |||
| 11.3. Encapsulation Header | A.3. Encapsulation Header | |||
| 11.3.1. Virtual Network Identifier (VNI) | A.3.1. Virtual Network Identifier (VNI) | |||
| The Geneve and VXLAN-GPE headers both include a 24-bit VNI field. | The Geneve and VXLAN-GPE headers both include a 24-bit VNI field. | |||
| GUE, on the other hand, enables the use of a 32-bit field called | GUE, on the other hand, enables the use of a 32-bit field called | |||
| VNID; this field is not included in the GUE header, but was defined | VNID; this field is not included in the GUE header, but was defined | |||
| as an optional extension in [I-D.herbert-gue-extensions]. | as an optional extension in [I-D.herbert-gue-extensions]. | |||
| The VXLAN-GPE header includes the 'I' bit, indicating that the VNI | The VXLAN-GPE header includes the 'I' bit, indicating that the VNI | |||
| field is valid in the current header. A similar indicator is defined | field is valid in the current header. A similar indicator is defined | |||
| as a flag in the GUE header herbert-gue-extensions. | as a flag in the GUE header [I-D.herbert-gue-extensions]. | |||
| 11.3.2. Next Protocol | A.3.2. Next Protocol | |||
| The three encapsulation headers include a field that specifies the | The three encapsulation headers include a field that specifies the | |||
| type of the next protocol header, which resides after the NVO3 | type of the next protocol header, which resides after the NVO3 | |||
| encapsulation header. The Geneve header includes a 16-bit field that | encapsulation header. The Geneve header includes a 16-bit field that | |||
| uses the IEEE Ethertype convention. GUE uses an 8-bit field, which | uses the IEEE Ethertype convention. GUE uses an 8-bit field, which | |||
| uses the IANA Internet protocol numbering. The VXLAN-GPE header | uses the IANA Internet protocol numbering. The VXLAN-GPE header | |||
| incorporates an 8-bit Next Protocol field, using a VXLAN-GPE-specific | incorporates an 8-bit Next Protocol field, using a VXLAN-GPE-specific | |||
| registry, defined in [I-D.ietf-nvo3-vxlan-gpe]. | registry, defined in [I-D.ietf-nvo3-vxlan-gpe]. | |||
| The VXLAN-GPE header also includes the 'P' bit, which explicitly | The VXLAN-GPE header also includes the 'P' bit, which explicitly | |||
| indicates whether the Next Protocol field is present in the current | indicates whether the Next Protocol field is present in the current | |||
| header. | header. | |||
| 11.3.3. Other Header Fields | A.3.3. Other Header Fields | |||
| The OAM bit, which is defined in Geneve and in VXLAN-GPE, indicates | The OAM bit, which is defined in Geneve and in VXLAN-GPE, indicates | |||
| whether the current packet is an OAM packet. The GUE header includes | whether the current packet is an OAM packet. The GUE header includes | |||
| a similar field, but uses different terminology; the GUE 'C-bit' | a similar field, but uses different terminology; the GUE 'C-bit' | |||
| specifies whether the current packet is a control packet. Note that | specifies whether the current packet is a control packet. Note that | |||
| the GUE control bit can potentially be used in a large set of | the GUE control bit can potentially be used in a large set of | |||
| protocols that are not OAM protocols. However, the control packet | protocols that are not OAM protocols. However, the control packet | |||
| examples discussed in [I-D.ietf-nvo3-gue] are OAM-related. | examples discussed in [I-D.ietf-intarea-gue] are OAM-related. | |||
| Each of the three NVO3 encapsulation headers includes a 2-bit Version | Each of the three NVO3 encapsulation headers includes a 2-bit Version | |||
| field, which is currently defined to be zero. | field, which is currently defined to be zero. | |||
| The Geneve and VXLAN-GPE headers include reserved fields; 14 bits in | The Geneve and VXLAN-GPE headers include reserved fields; 14 bits in | |||
| the Geneve header, and 27 bits in the VXLAN-GPE header are reserved. | the Geneve header, and 27 bits in the VXLAN-GPE header are reserved. | |||
| 11.4. Comparison Summary | A.4. Comparison Summary | |||
| The following table summarizes the comparison between the three NVO3 | The following table summarizes the comparison between the three NVO3 | |||
| encapsulations. | encapsulations: | |||
| +----------------+----------------+----------------+----------------+ | +----------------+----------------+----------------+----------------+ | |||
| | | Geneve | GUE | VXLAN-GPE | | | | Geneve | GUE | VXLAN-GPE | | |||
| +----------------+----------------+----------------+----------------+ | +----------------+----------------+----------------+----------------+ | |||
| | Outer transport| UDP/IP | UDP/IP | UDP/IP | | | Outer transport| UDP/IP | UDP/IP | UDP/IP | | |||
| +----------------+----------------+----------------+----------------+ | +----------------+----------------+----------------+----------------+ | |||
| | Base header | 8 octets | 4 octets | 8 octets | | | Base header | 8 octets | 4 octets | 8 octets | | |||
| | length | | | (16 octets | | | length | | | (16 octets | | |||
| | | | | using NSH) | | | | | | using NSH) | | |||
| +----------------+----------------+----------------+----------------+ | +----------------+----------------+----------------+----------------+ | |||
| | Extensibility |Variable length |Extension fields| No native ext- | | | Extensibility |Variable length |Extension fields| No native ext- | | |||
| skipping to change at page 17, line 31 ¶ | skipping to change at page 22, line 55 ¶ | |||
| | indicator | | | | | | indicator | | | | | |||
| +----------------+----------------+----------------+----------------+ | +----------------+----------------+----------------+----------------+ | |||
| | OAM / control | OAM bit | Control bit | OAM bit | | | OAM / control | OAM bit | Control bit | OAM bit | | |||
| | field | | | | | | field | | | | | |||
| +----------------+----------------+----------------+----------------+ | +----------------+----------------+----------------+----------------+ | |||
| | Version field | 2 bits | 2 bits | 2 bits | | | Version field | 2 bits | 2 bits | 2 bits | | |||
| +----------------+----------------+----------------+----------------+ | +----------------+----------------+----------------+----------------+ | |||
| | Reserved bits | 14 bits | - | 27 bits | | | Reserved bits | 14 bits | - | 27 bits | | |||
| +----------------+----------------+----------------+----------------+ | +----------------+----------------+----------------+----------------+ | |||
| Figure 1: NVO3 Encapsulation Comparison | Figure 1: NVO3 Encapsulations Comparison | |||
| 12. Contributors | ||||
| the following co-authors have contributed to this document. | ||||
| Ilango Ganga Intel Email: ilango.s.ganga@intel.com | ||||
| Pankaj Garg Microsoft Email: pankajg@microsoft.com | ||||
| Rajeev Manur Broadcom Email: rajeev.manur@broadcom.com | ||||
| Tal Mizrahi Marvell Email: talmi@marvell.com | ||||
| David Mozes Email: mosesster@gmail.com | ||||
| Erik Nordmark Email: nordmark@sonic.net | Internet-Draft NVO3 Encapsulation Considerations | |||
| Michael Smith Cisco Email: michsmit@cisco.com | Contributors | |||
| Sam Aldrin Google Email: aldrin.ietf@gmail.com | ||||
| Ignas Bagdonas Equinix Email: ibagdona.ietf@gmail.com | The following co-authors have contributed to this document: | |||
| 13. References | Ilango Ganga Intel Email: ilango.s.ganga@intel.com | |||
| 13.1. Normative References | Pankaj Garg Microsoft Email: pankajg@microsoft.com | |||
| [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate | Rajeev Manur Broadcom Email: rajeev.manur@broadcom.com | |||
| Requirement Levels", BCP 14, RFC 2119, | ||||
| DOI 10.17487/RFC2119, March 1997, | ||||
| <https://www.rfc-editor.org/info/rfc2119>. | ||||
| 13.2. Informative References | Tal Mizrahi Marvell Email: talmi@marvell.com | |||
| [I-D.herbert-gue-extensions] | David Mozes Email: mosesster@gmail.com | |||
| Herbert, T., Yong, L., and F. Templin, "Extensions for | ||||
| Generic UDP Encapsulation", draft-herbert-gue- | ||||
| extensions-01 (work in progress), October 2016. | ||||
| [I-D.ietf-nvo3-geneve] | Erik Nordmark Email: nordmark@sonic.net | |||
| Gross, J., Ganga, I., and T. Sridhar, "Geneve: Generic | ||||
| Network Virtualization Encapsulation", draft-ietf- | ||||
| nvo3-geneve-14 (work in progress), September 2019. | ||||
| [I-D.ietf-nvo3-gue] | Michael Smith Cisco Email: michsmit@cisco.com | |||
| Herbert, T., Yong, L., and O. Zia, "Generic UDP | ||||
| Encapsulation", draft-ietf-nvo3-gue-05 (work in progress), | ||||
| October 2016. | ||||
| [I-D.ietf-nvo3-vxlan-gpe] | Sam Aldrin Google Email: aldrin.ietf@gmail.com | |||
| Maino, F., Kreeger, L., and U. Elzur, "Generic Protocol | ||||
| Extension for VXLAN", draft-ietf-nvo3-vxlan-gpe-09 (work | ||||
| in progress), December 2019. | ||||
| [I-D.smith-vxlan-group-policy] | Ignas Bagdonas Equinix Email: ibagdona.ietf@gmail.com | |||
| Smith, M. and L. Kreeger, "VXLAN Group Policy Option", | ||||
| draft-smith-vxlan-group-policy-05 (work in progress), | ||||
| October 2018. | ||||
| [RFC2418] Bradner, S., "IETF Working Group Guidelines and | Internet-Draft NVO3 Encapsulation Considerations | |||
| Procedures", BCP 25, RFC 2418, DOI 10.17487/RFC2418, | ||||
| September 1998, <https://www.rfc-editor.org/info/rfc2418>. | ||||
| [RFC3985] Bryant, S., Ed. and P. Pate, Ed., "Pseudo Wire Emulation | Authors' Addresses | |||
| Edge-to-Edge (PWE3) Architecture", RFC 3985, | ||||
| DOI 10.17487/RFC3985, March 2005, | ||||
| <https://www.rfc-editor.org/info/rfc3985>. | ||||
| [RFC8300] Quinn, P., Ed., Elzur, U., Ed., and C. Pignataro, Ed., | Sami Boutros (editor) | |||
| "Network Service Header (NSH)", RFC 8300, | Ciena | |||
| DOI 10.17487/RFC8300, January 2018, | USA | |||
| <https://www.rfc-editor.org/info/rfc8300>. | ||||
| Author's Address | Email: sboutros@ciena.com | |||
| Sami Boutros (editor) | Donald E. Eastlake, 3rd (editor) | |||
| Ciena | Futurewei Technologies | |||
| USA | 2386 Panoramic Circle | |||
| Apopka, FL 32703 | ||||
| USA | ||||
| Email: sboutros@ciena.com | Tel: +1-508-333-2270 | |||
| Email: d3e3e3@gmail.com | ||||
| End of changes. 161 change blocks. | ||||
| 332 lines changed or deleted | 413 lines changed or added | |||
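As an illustration of the header fields compared in Appendix A (2-bit Version, option length, OAM and critical bits, 16-bit protocol type, 24-bit VNI) and of the maximal header lengths quoted in A.2.4, the following is a minimal sketch in Python. It is not part of either draft revision. The bit positions of the Geneve base header are taken from RFC 8926 rather than from the text above, the 5-bit width of the GUE length field is an assumption based on the GUE draft, and all names (`GeneveBase`, `parse_geneve_base`, `max_header_len`) are hypothetical.

```python
import struct
from dataclasses import dataclass

GENEVE_BASE_LEN = 8  # octets, per the comparison in A.1


@dataclass
class GeneveBase:
    """Decoded fields of the 8-octet Geneve base header (hypothetical type)."""
    version: int        # 2-bit Version field
    opt_len: int        # length of the options, in octets
    oam: bool           # 'O' bit: packet is an OAM packet
    critical: bool      # 'C' bit: critical options are present
    protocol_type: int  # 16-bit field using the Ethertype convention
    vni: int            # 24-bit Virtual Network Identifier

    @property
    def header_len(self) -> int:
        """Total encapsulation header length, base header plus options."""
        return GENEVE_BASE_LEN + self.opt_len


def parse_geneve_base(buf: bytes) -> GeneveBase:
    """Parse the base header only; options are skipped, not interpreted."""
    if len(buf) < GENEVE_BASE_LEN:
        raise ValueError("buffer shorter than the Geneve base header")
    # Layout assumed from RFC 8926:
    #   octet 0: Ver (2 bits) | Opt Len (6 bits, in 4-octet units)
    #   octet 1: O | C | 6 reserved bits
    #   octets 2-3: Protocol Type
    #   octets 4-7: VNI (24 bits) | 8 reserved bits
    first, flags, proto, vni_rsvd = struct.unpack("!BBHI", buf[:GENEVE_BASE_LEN])
    return GeneveBase(
        version=first >> 6,
        opt_len=(first & 0x3F) * 4,
        oam=bool(flags & 0x80),
        critical=bool(flags & 0x40),
        protocol_type=proto,
        vni=vni_rsvd >> 8,
    )


def max_header_len(base_octets: int, len_field_bits: int, unit: int = 4) -> int:
    """Largest header expressible with a length field counted in `unit` octets.

    Assumes the length field counts only the octets that follow the base header.
    """
    return base_octets + ((1 << len_field_bits) - 1) * unit


if __name__ == "__main__":
    # Sample header: Opt Len = 2 (8 octets of options), C bit set,
    # protocol type 0x6558, VNI 0x123456.
    sample = struct.pack("!BBHI", 0x02, 0x40, 0x6558, 0x12345600)
    print(parse_geneve_base(sample))
    print("Geneve max:", max_header_len(8, 6))  # 6-bit Opt Len
    print("GUE max:", max_header_len(4, 5))     # assumed 5-bit Hlen
```

Under these assumptions the arithmetic reproduces the figures given in A.2.4: 8 + 63 * 4 = 260 octets for Geneve and 4 + 31 * 4 = 128 octets for GUE. A transit device that only needs to skip the encapsulation can use `header_len` without interpreting any option, which is the simplification described in A.2.2.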