| < draft-ietf-nvo3-mcast-framework-09.txt | draft-ietf-nvo3-mcast-framework-10.txt > | |||
|---|---|---|---|---|
| NVO3 working group A. Ghanwani | NVO3 working group A. Ghanwani | |||
| Internet Draft Dell | Internet Draft Dell | |||
| Intended status: Informational L. Dunbar | Intended status: Informational L. Dunbar | |||
| Expires: November 8, 2017 M. McBride | Expires: November 8, 2018 M. McBride | |||
| Huawei | Huawei | |||
| V. Bannai | V. Bannai | |||
| R. Krishnan | R. Krishnan | |||
| Dell | Dell | |||
| June 23, 2017 | October 5, 2017 | |||
| A Framework for Multicast in Network Virtualization Overlays | A Framework for Multicast in Network Virtualization Overlays | |||
| draft-ietf-nvo3-mcast-framework-09 | draft-ietf-nvo3-mcast-framework-10 | |||
| Status of this Memo | Status of this Memo | |||
| This Internet-Draft is submitted in full conformance with the | This Internet-Draft is submitted in full conformance with the | |||
| provisions of BCP 78 and BCP 79. This document may not be modified, | provisions of BCP 78 and BCP 79. This document may not be modified, | |||
| and derivative works of it may not be created, except to publish it | and derivative works of it may not be created, except to publish it | |||
| as an RFC and to translate it into languages other than English. | as an RFC and to translate it into languages other than English. | |||
| skipping to change at page 2, line 5 ¶ | skipping to change at page 2, line 5 ¶ | |||
| reference material or to cite them other than as "work in progress." | reference material or to cite them other than as "work in progress." | |||
| The list of current Internet-Drafts can be accessed at | The list of current Internet-Drafts can be accessed at | |||
| http://www.ietf.org/ietf/1id-abstracts.txt | http://www.ietf.org/ietf/1id-abstracts.txt | |||
| The list of Internet-Draft Shadow Directories can be accessed at | The list of Internet-Draft Shadow Directories can be accessed at | |||
| http://www.ietf.org/shadow.html | http://www.ietf.org/shadow.html | |||
| This Internet-Draft will expire on November 8, 2016. | This Internet-Draft will expire on November 8, 2016. | |||
| Internet-Draft A framework for multicast in NVO3 | ||||
| Copyright Notice | Copyright Notice | |||
| Copyright (c) 2017 IETF Trust and the persons identified as the | Copyright (c) 2017 IETF Trust and the persons identified as the | |||
| document authors. All rights reserved. | document authors. All rights reserved. | |||
| This document is subject to BCP 78 and the IETF Trust's Legal | This document is subject to BCP 78 and the IETF Trust's Legal | |||
| Provisions Relating to IETF Documents | Provisions Relating to IETF Documents | |||
| (http://trustee.ietf.org/license-info) in effect on the date of | (http://trustee.ietf.org/license-info) in effect on the date of | |||
| publication of this document. Please review these documents | publication of this document. Please review these documents | |||
| carefully, as they describe your rights and restrictions with | carefully, as they describe your rights and restrictions with | |||
| respect to this document. Code Components extracted from this | respect to this document. Code Components extracted from this | |||
| document must include Simplified BSD License text as described in | document must include Simplified BSD License text as described in | |||
| Section 4.e of the Trust Legal Provisions and are provided without | Section 4.e of the Trust Legal Provisions and are provided without | |||
| warranty as described in the Simplified BSD License. | warranty as described in the Simplified BSD License. | |||
| Abstract | Abstract | |||
| This document discusses a framework of supporting multicast traffic | This document provides a framework of supporting multicast traffic | |||
| in a network that uses Network Virtualization Overlays (NVO3). Both | in a network that uses Network Virtualization Overlays (NVO3). Both | |||
| infrastructure multicast and application-specific multicast are | infrastructure multicast and application-specific multicast are | |||
| discussed. It describes the various mechanisms that can be used for | discussed. It describes the various mechanisms that can be used for | |||
| delivering such traffic as well as the data plane and control plane | delivering such traffic as well as the data plane and control plane | |||
| considerations for each of the mechanisms. | considerations for each of the mechanisms. | |||
| Table of Contents | Table of Contents | |||
| 1. Introduction...................................................3 | 1. Introduction...................................................3 | |||
| 1.1. Infrastructure multicast..................................3 | 1.1. Infrastructure multicast..................................3 | |||
| skipping to change at page 2, line 44 ¶ | skipping to change at page 2, line 46 ¶ | |||
| 1.3. Terminology clarification.................................4 | 1.3. Terminology clarification.................................4 | |||
| 2. Acronyms.......................................................4 | 2. Acronyms.......................................................4 | |||
| 3. Multicast mechanisms in networks that use NVO3.................5 | 3. Multicast mechanisms in networks that use NVO3.................5 | |||
| 3.1. No multicast support......................................6 | 3.1. No multicast support......................................6 | |||
| 3.2. Replication at the source NVE.............................6 | 3.2. Replication at the source NVE.............................6 | |||
| 3.3. Replication at a multicast service node...................9 | 3.3. Replication at a multicast service node...................9 | |||
| 3.4. IP multicast in the underlay.............................10 | 3.4. IP multicast in the underlay.............................10 | |||
| 3.5. Other schemes............................................12 | 3.5. Other schemes............................................12 | |||
| 4. Simultaneous use of more than one mechanism...................12 | 4. Simultaneous use of more than one mechanism...................12 | |||
| 5. Other issues..................................................12 | 5. Other issues..................................................12 | |||
| 5.1. Multicast-agnostic NVEs..................................12 | 5.1. Multicast-agnostic NVEs..................................13 | |||
| 5.2. Multicast membership management for DC with VMs..........13 | 5.2. Multicast membership management for DC with VMs..........13 | |||
| 6. Summary.......................................................14 | 6. Summary.......................................................14 | |||
| 7. Security Considerations.......................................14 | 7. Security Considerations.......................................14 | |||
| 8. IANA Considerations...........................................14 | 8. IANA Considerations...........................................14 | |||
| 9. References....................................................14 | 9. References....................................................14 | |||
| 9.1. Normative References.....................................14 | 9.1. Normative References.....................................14 | |||
| 9.2. Informative References...................................15 | 9.2. Informative References...................................15 | |||
| 10. Acknowledgments..............................................16 | 10. Acknowledgments..............................................16 | |||
| 1. Introduction | 1. Introduction | |||
| Network virtualization using Overlays over Layer 3 (NVO3) is a | Network virtualization using Overlays over Layer 3 (NVO3)[RFC7365] | |||
| technology that is used to address issues that arise in building | is a technology that is used to address issues that arise in | |||
| large, multitenant data centers that make extensive use of server | building large, multitenant data centers that make extensive use of | |||
| virtualization [RFC7364]. | server virtualization [RFC7364]. | |||
| This document provides a framework for supporting multicast traffic, | This document provides a framework for supporting multicast traffic, | |||
| in a network that uses Network Virtualization using Overlays over | in a network that uses Network Virtualization using Overlays over | |||
| Layer 3 (NVO3). Both infrastructure multicast and application- | Layer 3 (NVO3). Both infrastructure multicast and application- | |||
| specific multicast are considered. It describes the various | specific multicast are considered. It describes the various | |||
| mechanisms and considerations that can be used for delivering such | mechanisms and considerations that can be used for delivering such | |||
| traffic in networks that use NVO3. | traffic in networks that use NVO3. | |||
| The reader is assumed to be familiar with the terminology as defined | The reader is assumed to be familiar with the terminology as defined | |||
| in the NVO3 Framework document [RFC7365] and NVO3 Architecture | in the NVO3 Framework document [RFC7365] and NVO3 Architecture | |||
| document [RFC8014]. | document [RFC8014]. | |||
| 1.1. Infrastructure multicast | 1.1. Infrastructure multicast | |||
| Infrastructure multicast is a capability needed by networking | Infrastructure multicast is a capability needed by networking | |||
| services, such as Address Resolution Protocol (ARP), Neighbor | services, such as Address Resolution Protocol (ARP), Neighbor | |||
| Discovery (ND), Dynamic Host Configuration Protocol (DHCP), | Discovery (ND), Dynamic Host Configuration Protocol (DHCP), | |||
| multicast Domain Name Server (mDNS), etc.. RFC3819 Section 5 and 6 | multicast Domain Name Server (mDNS), etc. RFC3819 Section 5 and 6 | |||
| have detailed description for some of the infrastructure multicast | have detailed description for some of the infrastructure multicast | |||
| [RFC3819]. It is possible to provide solutions for these that do | [RFC3819]. It is possible to provide solutions for these that do | |||
| not involve multicast in the underlay network. In the case of | not involve multicast in the underlay network. In the case of | |||
| ARP/ND, a network virtualization authority (NVA) can be used for | ARP/ND, a network virtualization authority (NVA) can be used for | |||
| distributing the mappings of IP address to MAC address to all | distributing the mappings of IP address to MAC address to all | |||
| network virtualization edges (NVEs). The NVEs can then trap ARP | network virtualization edges (NVEs). The NVEs can then trap ARP | |||
| Request/ND Neighbor Solicitation messages from the TSs that are | Request/ND Neighbor Solicitation messages from the TSs (Tenant | |||
| attached to it and respond to them, thereby eliminating the need | System) that are attached to it and respond to them, thereby | |||
| for broadcast/multicast of such messages. In the case of DHCP, the | eliminating the need for broadcast/multicast of such messages. | |||
| NVE can be configured to forward these messages using a helper | In the case of DHCP, the NVE can be configured to forward these | |||
| function. | messages using a helper function. | |||
| Of course it is possible to support all of these infrastructure | Of course it is possible to support all of these infrastructure | |||
| multicast protocols natively if the underlay provides multicast | multicast protocols natively if the underlay provides multicast | |||
| transport. However, even in the presence of multicast transport, it | transport. However, even in the presence of multicast transport, it | |||
| may be beneficial to use the optimizations mentioned above to reduce | may be beneficial to use the optimizations mentioned above to reduce | |||
| the amount of such traffic in the network. | the amount of such traffic in the network. | |||
| 1.2. Application-specific multicast | 1.2. Application-specific multicast | |||
| Application-specific multicast traffic is originated and consumed | Application-specific multicast traffic is originated and consumed | |||
| by user applications. The Application-specific multicast, which can | by user applications. The Application-specific multicast, which can | |||
| be either Source-Specific Multicast (SSM) or Any-Source Multicast | be either Source-Specific Multicast (SSM) or Any-Source Multicast | |||
| (ASM)[RFC3569], has the following characteristics: | (ASM)[RFC3569], has the following characteristics: | |||
| 1. Receiver hosts are expected to subscribe to multicast content | 1. Receiver hosts are expected to subscribe to multicast content | |||
| using protocols such as IGMP [RFC3376] (IPv4) or MLD (IPv6). | using protocols such as IGMP [RFC3376] (IPv4) or MLD [RFC2710] | |||
| Multicast sources and listeners participate in these protocols | (IPv6). Multicast sources and listeners participate in these | |||
| using addresses that are in the Tenant System address domain. | protocols using addresses that are in the Tenant System address | |||
| | domain. | |||
| 2. The list of multicast listeners for each multicast group is not | 2. The list of multicast listeners for each multicast group is not | |||
| known in advance. Therefore, it may not be possible for an NVA | known in advance. Therefore, it may not be possible for an NVA | |||
| to get the list of participants for each multicast group ahead | to get the list of participants for each multicast group ahead | |||
| of time. | of time. | |||
| 1.3. Terminology clarification | 1.3. Terminology clarification | |||
| | 2. Acronyms & Terminology | |||
| In this document, the terms host, tenant system (TS) and virtual | In this document, the terms host, tenant system (TS) and virtual | |||
| machine (VM) are used interchangeably to represent an end station | machine (VM) are used interchangeably to represent an end station | |||
| that originates or consumes data packets. | that originates or consumes data packets. | |||
| 2. Acronyms | ||||
| ASM: Any-Source Multicast | ASM: Any-Source Multicast | |||
| IGMP: Internet Group Management Protocol | IGMP: Internet Group Management Protocol | |||
| LISP: Locator/ID Separation Protocol | LISP: Locator/ID Separation Protocol | |||
| MSN: Multicast Service Node | MSN: Multicast Service Node | |||
| RLOC: Routing Locator | RLOC: Routing Locator | |||
| NVA: Network Virtualization Authority | NVA: Network Virtualization Authority | |||
| NVE: Network Virtualization Edge | NVE: Network Virtualization Edge | |||
| NVGRE: Network Virtualization using GRE | NVGRE: Network Virtualization using GRE | |||
| PIM: Protocol-Independent Multicast | PIM: Protocol-Independent Multicast | |||
| SSM: Source-Specific Multicast | SSM: Source-Specific Multicast | |||
| TS: Tenant system | TS: Tenant system | |||
| VM: Virtual Machine | VM: Virtual Machine | |||
| VN: Virtual Network | VN: Virtual Network | |||
| VTEP: VxLAN Tunnel End Points | VTEP: VxLAN Tunnel End Points | |||
| VXLAN: Virtual eXtensible LAN | VXLAN: Virtual eXtensible LAN | |||
| 3. Multicast mechanisms in networks that use NVO3 | 3. Multicast mechanisms in networks that use NVO3 | |||
| In NVO3 environments, traffic between NVEs is transported using an | In NVO3 environments, traffic between NVEs is transported using an | |||
| encapsulation such as Virtual eXtensible Local Area Network (VXLAN) | encapsulation such as Virtual eXtensible Local Area Network (VXLAN) | |||
| [RFC7348,VXLAN-GPE], Network Virtualization Using Generic Routing | [RFC7348,VXLAN-GPE], Network Virtualization Using Generic Routing | |||
| Encapsulation (NVGRE) [RFC7637], , Geneve [Geneve], Generic UDP | Encapsulation (NVGRE) [RFC7637], Geneve [Geneve], Generic UDP | |||
| Encapsulation (GUE) [GUE], etc. | Encapsulation (GUE) [GUE], etc. | |||
| What makes NVO3 different from any other network is that some NVEs, | What makes NVO3 different from any other network is that some NVEs, | |||
| especially the NVE implemented on server, might not support PIM or | especially the NVE implemented on server, might not support PIM or | |||
| other native multicast mechanisms. They might just encapsulate the | other native multicast mechanisms. They might just encapsulate the | |||
| data packets from VMs with an outer unicast header. Therefore, it is | data packets from VMs with an outer unicast header. Therefore, it is | |||
| important for networks using NVO3 to have mechanisms to support | important for networks using NVO3 to have mechanisms to support | |||
| multicast as a network capability for NVEs, to map multicast traffic | multicast as a network capability for NVEs, to map multicast traffic | |||
| from VMs (users/applications) to an equivalent multicast capability | from VMs (users/applications) to an equivalent multicast capability | |||
| inside the NVE, or to figure out the outer destination address if | inside the NVE, or to figure out the outer destination address if | |||
| skipping to change at page 6, line 5 ¶ | skipping to change at page 6, line 5 ¶ | |||
| the attributes of the following four methods: | the attributes of the following four methods: | |||
| 1. No multicast support. | 1. No multicast support. | |||
| 2. Replication at the source NVE. | 2. Replication at the source NVE. | |||
| 3. Replication at a multicast service node. | 3. Replication at a multicast service node. | |||
| 4. IP multicast in the underlay. | 4. IP multicast in the underlay. | |||
| These methods are briefly mentioned in the NVO3 Framework [RFC7365] | These methods are briefly mentioned in the NVO3 Framework [RFC7365] | |||
| and NVO3 architecture [RFC8014] document. This document provides | and NVO3 architecture [RFC8014] document. This document provides | |||
| more details about the basic mechanisms underlying each of these | more details about the basic mechanisms underlying each of these | |||
| methods and discusses the issues and tradeoffs of each. | methods and discusses the issues and trade-offs of each. | |||
| We note that other methods are also possible, such as [EDGE-REP], | We note that other methods are also possible, such as [EDGE-REP], | |||
| but we focus on the above four because they are the most common. | but we focus on the above four because they are the most common. | |||
| 3.1. No multicast support | 3.1. No multicast support | |||
| In this scenario, there is no support whatsoever for multicast | In this scenario, there is no support whatsoever for multicast | |||
| traffic when using the overlay. This method can only work if the | traffic when using the overlay. This method can only work if the | |||
| following conditions are met: | following conditions are met: | |||
| skipping to change at page 7, line 4 ¶ | skipping to change at page 7, line 4 ¶ | |||
| is a problem in the case where the NVE is implemented in a physical | is a problem in the case where the NVE is implemented in a physical | |||
| switch and the TS is a physical end station that has not registered | switch and the TS is a physical end station that has not registered | |||
| with the NVA. | with the NVA. | |||
| 3.2. Replication at the source NVE | 3.2. Replication at the source NVE | |||
| With this method, the overlay attempts to provide a multicast | With this method, the overlay attempts to provide a multicast | |||
| service without requiring any specific support from the underlay, | service without requiring any specific support from the underlay, | |||
| other than that of a unicast service. A multicast or broadcast | other than that of a unicast service. A multicast or broadcast | |||
| transmission is achieved by replicating the packet at the source | transmission is achieved by replicating the packet at the source | |||
| NVE, and making copies, one for each destination NVE that the | NVE, and making copies, one for each destination NVE that the | |||
| multicast packet must be sent to. | multicast packet must be sent to. | |||
| For this mechanism to work, the source NVE must know, a priori, the | For this mechanism to work, the source NVE must know, a priori, the | |||
| IP addresses of all destination NVEs that need to receive the | IP addresses of all destination NVEs that need to receive the | |||
| packet. For the purpose of ARP/ND, this would involve knowing the | packet. For the purpose of ARP/ND, this would involve knowing the | |||
| IP addresses of all the NVEs that have TSs in the virtual network | IP addresses of all the NVEs that have TSs in the virtual network | |||
| (VN) of the TS that generated the request. For the support of | (VN) of the TS that generated the request. For the support of | |||
| application-specific multicast traffic, a method similar to that of | application-specific multicast traffic, a method similar to that of | |||
| receiver-sites registration for a particular multicast group | receiver-sites registration for a particular multicast group | |||
| skipping to change at page 7, line 48 ¶ | skipping to change at page 7, line 51 ¶ | |||
| In multi-homing environments, i.e., in those where a TS is attached | In multi-homing environments, i.e., in those where a TS is attached | |||
| to more than one NVE, the NVA would be expected to provide | to more than one NVE, the NVA would be expected to provide | |||
| information to all of the NVEs under its control about all of the | information to all of the NVEs under its control about all of the | |||
| NVEs to which such a TS is attached. The ingress NVE can choose any | NVEs to which such a TS is attached. The ingress NVE can choose any | |||
| one of the egress NVEs for the data frames destined towards the TS. | one of the egress NVEs for the data frames destined towards the TS. | |||
| This method requires multiple copies of the same packet to all NVEs | This method requires multiple copies of the same packet to all NVEs | |||
| that participate in the VN. If, for example, a tenant subnet is | that participate in the VN. If, for example, a tenant subnet is | |||
| spread across 50 NVEs, the packet would have to be replicated 50 | spread across 50 NVEs, the packet would have to be replicated 50 | |||
| times at the source NVE. This also creates an issue with the | times at the source NVE. Obviously, this approach creates more | |||
| forwarding performance of the NVE. | traffic to the network that can cause congestion when the network | |||
| | load is high. This also creates an issue with the forwarding | |||
| | performance of the NVE. | |||
| Note that this method is similar to what was used in Virtual Private | Note that this method is similar to what was used in Virtual Private | |||
| LAN Service (VPLS) [RFC4762] prior to support of Multi-Protocol | LAN Service (VPLS) [RFC4762] prior to support of Multi-Protocol | |||
| Label Switching (MPLS) multicast [RFC7117]. While there are some | Label Switching (MPLS) multicast [RFC7117]. While there are some | |||
| similarities between MPLS Virtual Private Network (VPN) and NVO3, | similarities between MPLS Virtual Private Network (VPN) and NVO3, | |||
| there are some key differences: | there are some key differences: | |||
| - The Customer Edge (CE) to Provider Edge (PE) attachment in VPNs is | - The Customer Edge (CE) to Provider Edge (PE) attachment in VPNs is | |||
| somewhat static, whereas in a DC that allows VMs to migrate | somewhat static, whereas in a DC that allows VMs to migrate | |||
| anywhere, the TS attachment to NVE is much more dynamic. | anywhere, the TS attachment to NVE is much more dynamic. | |||
| - The number of PEs to which a single VPN customer is attached in | - The number of PEs to which a single VPN customer is attached in | |||
| an MPLS VPN environment is normally far less than the number of | an MPLS VPN environment is normally far less than the number of | |||
| NVEs to which a VN's VMs are attached in a DC. | NVEs to which a VN's VMs are attached in a DC. | |||
| When a VPN customer has multiple multicast groups, [RFC6513] | When a VPN customer has multiple multicast groups, "Multicast VPN" | |||
| "Multicast VPN" combines all those multicast groups within each | [RFC6513] combines all those multicast groups within each VPN | |||
| VPN client to one single multicast group in the MPLS (or VPN) | client to one single multicast group in the MPLS (or VPN) core. | |||
| core. The result is that messages from any of the multicast | The result is that messages from any of the multicast groups | |||
| groups belonging to one VPN customer will reach all the PE nodes | belonging to one VPN customer will reach all the PE nodes of the | |||
| of the client. In other words, any messages belonging to any | client. In other words, any messages belonging to any multicast | |||
| multicast groups under customer X will reach all PEs of the | groups under customer X will reach all PEs of the customer X. When | |||
| customer X. When the customer X is attached to only a handful of | the customer X is attached to only a handful of PEs, the use of | |||
| PEs, the use of this approach does not result in excessive wastage | this approach does not result in excessive wastage of bandwidth in | |||
| of bandwidth in the provider's network. | the provider's network. | |||
| In a DC environment, a typical server/hypervisor based virtual | In a DC environment, a typical server/hypervisor based virtual | |||
| switch may only support tens of VMs (as of this writing). A subnet | switch may only support tens of VMs (as of this writing). A subnet | |||
| with N VMs may be, in the worst case, spread across N vSwitches. | with N VMs may be, in the worst case, spread across N vSwitches. | |||
| Using "MPLS VPN multicast" approach in such a scenario would | Using "MPLS VPN multicast" approach in such a scenario would | |||
| require the creation of a Multicast group in the core for this VN | require the creation of a Multicast group in the core for this VN | |||
| to reach all N NVEs. If only a small percentage of this client's VMs | to reach all N NVEs. If only a small percentage of this client's VMs | |||
| participate in application specific multicast, a great number of | participate in application specific multicast, a great number of | |||
| NVEs will receive multicast traffic that is not forwarded to any | NVEs will receive multicast traffic that is not forwarded to any | |||
| of their attached VMs, resulting in considerable wastage of | of their attached VMs, resulting in considerable wastage of | |||
| bandwidth. | bandwidth. | |||
| Therefore, the Multicast VPN solution may not scale in DC | Therefore, the Multicast VPN solution may not scale in DC | |||
| environment with dynamic attachment of Virtual Networks to NVEs and | environment with dynamic attachment of Virtual Networks to NVEs and | |||
| greater number of NVEs for each virtual network. | greater number of NVEs for each virtual network. | |||
| 3.3. Replication at a multicast service node | 3.3. Replication at a multicast service node | |||
| With this method, all multicast packets would be sent using a | With this method, all multicast packets would be sent using a | |||
| unicast tunnel encapsulation from the ingress NVE to a multicast | unicast tunnel encapsulation from the ingress NVE to a multicast | |||
| service node (MSN). The MSN, in turn, would create multiple copies | service node (MSN). The MSN, in turn, would create multiple copies | |||
| of the packet and would deliver a copy, using a unicast tunnel | of the packet and would deliver a copy, using a unicast tunnel | |||
| encapsulation, to each of the NVEs that are part of the multicast | encapsulation, to each of the NVEs that are part of the multicast | |||
| group for which the packet is intended. | group for which the packet is intended. | |||
| This mechanism is similar to that used by the Asynchronous Transfer | This mechanism is similar to that used by the Asynchronous Transfer | |||
| Mode (ATM) Forum's LAN Emulation (LANE)LANE specification [LANE]. | Mode (ATM) Forum's LAN Emulation (LANE) specification [LANE]. The | |||
| The MSN is similar to the RP in PIM SM, but different in that the | MSN is similar to the RP (Rendezvous Point) in PIM SM, but different | |||
| user data traffic are carried by the NVO3 tunnels. | in that the user data traffic are carried by the NVO3 tunnels. | |||
| The following are the possible ways for the MSN to get the | The following are the possible ways for the MSN to get the | |||
| membership information for each multicast group: | membership information for each multicast group: | |||
| - The MSN can obtain this membership information from the IGMP/MLD | - The MSN can obtain this membership information from the IGMP/MLD | |||
| report messages sent by TSs in response to IGMP/MLD query messages | report messages sent by TSs in response to IGMP/MLD query messages | |||
| from the MSN. The IGMP/MLD query messages are sent from the MSN to | from the MSN. The IGMP/MLD query messages are sent from the MSN to | |||
| the NVEs, which then forward the query messages to TSs attached to | the NVEs, which then forward the query messages to TSs attached to | |||
| them. An IGMP/MLD query message sent out by the MSN to an NVE is | them. An IGMP/MLD query message sent out by the MSN to an NVE is | |||
| encapsulated with the MSN address in the outer source address | encapsulated with the MSN address in the outer source address | |||
| skipping to change at page 9, line 43 ¶ | skipping to change at page 10, line 5 ¶ | |||
| establishes a mapping "MSN address" <-> "multicast address", | establishes a mapping "MSN address" <-> "multicast address", | |||
| decapsulates the received encapsulated IGMP/MLD message, and | decapsulates the received encapsulated IGMP/MLD message, and | |||
| multicasts the decapsulated query message to TSs that belong to the | multicasts the decapsulated query message to TSs that belong to the | |||
| VN under the NVE. An IGMP/MLD report message sent by a TS includes | VN under the NVE. An IGMP/MLD report message sent by a TS includes | |||
| the multicast address and the address of the TS. With the proper | the multicast address and the address of the TS. With the proper | |||
| "MSN Address" <-> "Multicast-Address" mapping, the NVEs can | "MSN Address" <-> "Multicast-Address" mapping, the NVEs can | |||
| encapsulate all multicast data frames to the "Multicast-Address" | encapsulate all multicast data frames to the "Multicast-Address" | |||
| with the address of the MSN in the outer destination address | with the address of the MSN in the outer destination address | |||
| field. | field. | |||
| - The MSN can obtain the membership information from the NVEs that | - The MSN can obtain the membership information from the NVEs that | |||
| have the capability to establish multicast groups by snooping | have the capability to establish multicast groups by snooping | |||
| native IGMP/MLD messages (note that the communication must be specific | native IGMP/MLD messages (note that the communication must be specific | |||
| to the multicast addresses), or by having the NVA obtain the | to the multicast addresses), or by having the NVA obtain the | |||
| information from the NVEs, and in turn have MSN communicate with | information from the NVEs, and in turn have MSN communicate with | |||
| the NVA. This approach requires additional protocol between MSN | the NVA. This approach requires additional protocol between MSN | |||
| and NVEs. | and NVEs. | |||
| Unlike the method described in Section 3.2, there is no performance | Unlike the method described in Section 3.2, there is no performance | |||
| impact at the ingress NVE, nor are there any issues with multiple | impact at the ingress NVE, nor are there any issues with multiple | |||
| skipping to change at page 10, line 41 ¶ | skipping to change at page 11, line 5 ¶ | |||
| dependent multicast, such as [ISIS-Multicast]. | dependent multicast, such as [ISIS-Multicast]. | |||
| If an NVE connects to its attached TSs via a Layer 2 network, there | If an NVE connects to its attached TSs via a Layer 2 network, there | |||
| are multiple ways for NVEs to support the application specific | are multiple ways for NVEs to support the application specific | |||
| multicast: | multicast: | |||
| - The NVE only supports the basic IGMP/MLD snooping function, letting | - The NVE only supports the basic IGMP/MLD snooping function, letting | |||
| the TSs' routers handle the application specific multicast. This | the TSs' routers handle the application specific multicast. This | |||
| scheme doesn't utilize the underlay IP multicast protocols. | scheme doesn't utilize the underlay IP multicast protocols. | |||
| - The NVE can act as a pseudo multicast router for the directly | - The NVE can act as a pseudo multicast router for the directly | |||
| attached VMs and support proper mapping of IGMP/MLD's messages to | attached VMs and support proper mapping of IGMP/MLD's messages to | |||
| the messages needed by the underlay IP multicast protocols. | the messages needed by the underlay IP multicast protocols. | |||
| With this method, there are none of the issues with the method | With this method, there are none of the issues with the method | |||
| described in Section 3.2. | described in Section 3.2. | |||
| With PIM Sparse Mode (PIM-SM), the number of flows required would be | With PIM Sparse Mode (PIM-SM), the number of flows required would be | |||
| (n*g), where n is the number of source NVEs that source packets for | (n*g), where n is the number of source NVEs that source packets for | |||
| the group, and g is the number of groups. Bidirectional PIM (BIDIR- | the group, and g is the number of groups. Bidirectional PIM (BIDIR- | |||
| PIM) would offer better scalability with the number of flows | PIM) would offer better scalability with the number of flows | |||
| required being g. Unfortunately, many vendors still do not fully | required being g. Unfortunately, many vendors still do not fully | |||
| support BIDIR or have limitations on its implementaion. RFC6831 | support BIDIR or have limitations on its implementation. RFC6831 | |||
| [RFC6831] has a good description of using SSM as an alternative to | [RFC6831] has a good description of using SSM as an alternative to | |||
| BIDIR if the VTEP/NVE devices have a way to learn of each other's IP | BIDIR if the VTEP/NVE devices have a way to learn of each other's IP | |||
| addresses so that they could join all SSM SPTs to create/maintain an | addresses so that they could join all SSM SPTs to create/maintain an | |||
| underlay SSM IP Multicast tunnel solution. | underlay SSM IP Multicast tunnel solution. | |||
| In the absence of any additional mechanism, e.g. using an NVA for | In the absence of any additional mechanism, e.g. using an NVA for | |||
| address resolution, for optimal delivery, there would have to be a | address resolution, for optimal delivery, there would have to be a | |||
| separate group for each tenant, plus a separate group for each | separate group for each tenant, plus a separate group for each | |||
| multicast address (used for multicast applications) within a tenant. | multicast address (used for multicast applications) within a tenant. | |||
| skipping to change at page 11, line 37 ¶ | skipping to change at page 11, line 43 ¶ | |||
| multicasts at Layer 2, there will be some aliasing. Finally, a | multicasts at Layer 2, there will be some aliasing. Finally, a | |||
| mechanism to efficiently provision such addresses for each group | mechanism to efficiently provision such addresses for each group | |||
| would be required. | would be required. | |||
| There are additional optimizations which are possible, but they come | There are additional optimizations which are possible, but they come | |||
| with their own restrictions. For example, a set of tenants may be | with their own restrictions. For example, a set of tenants may be | |||
| restricted to some subset of NVEs and they could all share the same | restricted to some subset of NVEs and they could all share the same | |||
| outer IP multicast group address. This however introduces a problem | outer IP multicast group address. This however introduces a problem | |||
| of sub-optimal delivery (even if a particular tenant within the | of sub-optimal delivery (even if a particular tenant within the | |||
| group of tenants doesn't have a presence on one of the NVEs which | group of tenants doesn't have a presence on one of the NVEs which | |||
| another one does, the former's multicast packets would still be | another one does, the multicast packets would still be delivered to | |||
| delivered to that NVE). It also introduces an additional network | that NVE). It also introduces an additional network management | |||
| management burden to optimize which tenants should be part of the | burden to optimize which tenants should be part of the same tenant | |||
| same tenant group (based on the NVEs they share), which somewhat | group (based on the NVEs they share), which somewhat dilutes the | |||
| dilutes the value proposition of NVO3 which is to completely | ||||
| decouple the overlay and physical network design allowing complete | Internet-Draft A framework for multicast in NVO3 | |||
| freedom of placement of VMs anywhere within the data center. | ||||
| value proposition of NVO3 which is to completely decouple the | ||||
| overlay and physical network design allowing complete freedom of | ||||
| placement of VMs anywhere within the data center. | ||||
| Multicast schemes such as BIER (Bit Indexed Explicit Replication) | Multicast schemes such as BIER (Bit Indexed Explicit Replication) | |||
| [BIER-ARCH] may be able to provide optimizations by allowing the | [BIER-ARCH] may be able to provide optimizations by allowing the | |||
| underlay network to provide optimum multicast delivery without | underlay network to provide optimum multicast delivery without | |||
| requiring routers in the core of the network to maintain per- | requiring routers in the core of the network to maintain per- | |||
| multicast group state. | multicast group state. | |||
| 3.5. Other schemes | 3.5. Other schemes | |||
| There are still other mechanisms that may be used that attempt to | There are still other mechanisms that may be used that attempt to | |||
| combine some of the advantages of the above methods by offering | combine some of the advantages of the above methods by offering | |||
| multiple replication points, each with a limited degree of | multiple replication points, each with a limited degree of | |||
| replication [EDGE-REP]. Such schemes offer a trade-off between the | replication [EDGE-REP]. Such schemes offer a trade-off between the | |||
| amount of replication at an intermediate node (router) versus | amount of replication at an intermediate node (e.g. router) versus | |||
| performing all of the replication at the source NVE or all of the | performing all of the replication at the source NVE or all of the | |||
| replication at a multicast service node. | replication at a multicast service node. | |||
| 4. Simultaneous use of more than one mechanism | 4. Simultaneous use of more than one mechanism | |||
| While the mechanisms discussed in the previous section have been | While the mechanisms discussed in the previous section have been | |||
| discussed individually, it is possible for implementations to rely | discussed individually, it is possible for implementations to rely | |||
| on more than one of these. For example, the method of Section 3.1 | on more than one of these. For example, the method of Section 3.1 | |||
| could be used for minimizing ARP/ND, while at the same time, | could be used for minimizing ARP/ND, while at the same time, | |||
| multicast applications may be supported by one, or a combination of, | multicast applications may be supported by one, or a combination of, | |||
| the other methods. For small multicast groups, the methods of | the other methods. For small multicast groups, the methods of | |||
| source NVE replication or the use of a multicast service node may be | source NVE replication or the use of a multicast service node may be | |||
| attractive, while for larger multicast groups, the use of multicast | attractive, while for larger multicast groups, the use of multicast | |||
| in the underlay may be preferable. | in the underlay may be preferable. | |||
| 5. Other issues | 5. Other issues | |||
| 5.1. Multicast-agnostic NVEs | 5.1. Multicast-agnostic NVEs | |||
| Some hypervisor-based NVEs do not process or recognize IGMP/MLD | Some hypervisor-based NVEs do not process or recognize IGMP/MLD | |||
| frames; i.e. those NVEs simply encapsulate the IGMP/MLD messages in | frames; i.e. those NVEs simply encapsulate the IGMP/MLD messages in | |||
| the same way as they do for regular data frames. | the same way as they do for regular data frames. | |||
| By default, the TSs' router periodically sends IGMP/MLD query messages to | By default, the TSs' router periodically sends IGMP/MLD query messages to | |||
| all the hosts in the subnet to trigger the hosts that are interested | all the hosts in the subnet to trigger the hosts that are interested | |||
| in the multicast stream to send back IGMP/MLD reports. In order for | in the multicast stream to send back IGMP/MLD reports. In order for | |||
| skipping to change at page 14, line 5 ¶ | skipping to change at page 14, line 5 ¶ | |||
| When a VM is deleted from an NVE or a new VM is added to an NVE, the | When a VM is deleted from an NVE or a new VM is added to an NVE, the | |||
| VM management system should notify the MSN to send the IGMP/MLD | VM management system should notify the MSN to send the IGMP/MLD | |||
| query messages to the relevant NVEs (as described in Section 3.3), | query messages to the relevant NVEs (as described in Section 3.3), | |||
| so that the multicast membership can be updated promptly. | so that the multicast membership can be updated promptly. | |||
| Otherwise, if VM attachments to NVEs change within | Otherwise, if VM attachments to NVEs change within | |||
| the duration of the configured default time interval that the TSs' | the duration of the configured default time interval that the TSs' | |||
| routers use for IGMP/MLD queries, multicast data may not reach the | routers use for IGMP/MLD queries, multicast data may not reach the | |||
| VM(s) that moved. | VM(s) that moved. | |||
| 6. Summary | 6. Summary | |||
| This document has identified various mechanisms for supporting | This document has identified various mechanisms for supporting | |||
| application specific multicast in networks that use NVO3. It | application specific multicast in networks that use NVO3. It | |||
| highlights the basics of each mechanism and some of the issues with | highlights the basics of each mechanism and some of the issues with | |||
| them. As solutions are developed, the protocols would need to | them. As solutions are developed, the protocols would need to | |||
| consider the use of these mechanisms and co-existence may be a | consider the use of these mechanisms and co-existence may be a | |||
| consideration. It also highlights some of the requirements for | consideration. It also highlights some of the requirements for | |||
| supporting multicast applications in an NVO3 network. | supporting multicast applications in an NVO3 network. | |||
| skipping to change at page 14, line 29 ¶ | skipping to change at page 14, line 31 ¶ | |||
| 8. IANA Considerations | 8. IANA Considerations | |||
| This document requires no IANA actions. RFC Editor: Please remove | This document requires no IANA actions. RFC Editor: Please remove | |||
| this section before publication. | this section before publication. | |||
| 9. References | 9. References | |||
| 9.1. Normative References | 9.1. Normative References | |||
| [RFC3376] Cain B. et al., "Internet Group Management Protocol, | [RFC3376] Cain B. et al. "Internet Group Management Protocol, | |||
| Version 3", October 2002. | Version 3", October 2002. | |||
| [RFC6513] Rosen, E. et al., "Multicast in MPLS/BGP IP VPNs", | [RFC6513] Rosen, E. et al., "Multicast in MPLS/BGP IP VPNs", | |||
| February 2012. | February 2012. | |||
| [RFC7364] Narten, T. et al., "Problem statement: Overlays for | [RFC7364] Narten, T. et al., "Problem statement: Overlays for | |||
| network virtualization", October 2014. | network virtualization", October 2014. | |||
| [RFC7365] Lasserre, M. et al., "Framework for data center (DC) | [RFC7365] Lasserre, M. et al., "Framework for data center (DC) | |||
| network virtualization", October 2014. | network virtualization", October 2014. | |||
| [RFC8014] Narten, T. et al., "An Architecture for Overlay Networks | [RFC8014] Narten, T. et al., "An Architecture for Overlay Networks | |||
| (NVO3)", RFC8014, Dec. 2016. | (NVO3)", RFC8014, Dec. 2016. | |||
| 9.2. Informative References | 9.2. Informative References | |||
| [RFC2710] S. Deering et al, "Multicast Listener Discovery (MLD) for | ||||
| IPv6", Oct 1999. | ||||
| [RFC3569] S. Bhattacharyya, Ed., "An Overview of Source-Specific | [RFC3569] S. Bhattacharyya, Ed., "An Overview of Source-Specific | |||
| Multicast (SSM)", July 2003. | Multicast (SSM)", July 2003. | |||
| [RFC3819] P. Karn et al., "Advice for Internet Subnetwork | [RFC3819] P. Karn et al., "Advice for Internet Subnetwork | |||
| Designers", July 2004. | Designers", July 2004. | |||
| [RFC4762] Lasserre, M., and Kompella, V. (Eds.), "Virtual Private | [RFC4762] Lasserre, M., and Kompella, V. (Eds.), "Virtual Private | |||
| LAN Service (VPLS) using Label Distribution Protocol (LDP) | LAN Service (VPLS) using Label Distribution Protocol (LDP) | |||
| signaling," January 2007. | signaling," January 2007. | |||
| [RFC6831] Farinacci, D. et al., "The Locator/ID Separation Protocol | [RFC6831] Farinacci, D. et al., "The Locator/ID Separation Protocol | |||
| (LISP) for Multicast Environments", Jan, 2013. | (LISP) for Multicast Environments", Jan, 2013. | |||
| [RFC7117] Aggarwal, R. et al., "Multicast in VPLS," February 2014. | [RFC7117] Aggarwal, R. et al., "Multicast in VPLS," February 2014. | |||
| [RFC7348] Mahalingam, M. et al., "Virtual eXtensible Local Area | [RFC7348] Mahalingam, M. et al., "Virtual eXtensible Local Area | |||
| Network (VXLAN): A Framework for Overlaying Virtualized | Network (VXLAN): A Framework for Overlaying Virtualized | |||
| Layer 2 Networks over Layer 3 Networks", August 2014. | Layer 2 Networks over Layer 3 Networks", August 2014. | |||
| [RFC7365] M. Lasserre, et al. "Framework for Data Center (DC) | ||||
| Network Virtualization", Oct 2014. | ||||
| [RFC7637] Garg P. and Wang, Y. (Eds.), "NVGRE: Network | [RFC7637] Garg P. and Wang, Y. (Eds.), "NVGRE: Network | |||
| Virtualization using Generic Routing Encapsulation", | Virtualization using Generic Routing Encapsulation", | |||
| September 2015. | September 2015. | |||
| [BIER-ARCH] | [BIER-ARCH] | |||
| Wijnands, IJ. (Ed.) et al., "Multicast using Bit Index | Wijnands, IJ. (Ed.) et al., "Multicast using Bit Index | |||
| Explicit Replication," <draft-ietf-bier-architecture-03>, | Explicit Replication," <draft-ietf-bier-architecture-03>, | |||
| January 2016. | January 2016. | |||
| [DC-MC] McBride, M. and Lui, H., "Multicast in the data center | [DC-MC] McBride, M. and Lui, H., "Multicast in the data center | |||
| overview," <draft-mcbride-armd-mcast-overview-02>, work in | overview," <draft-mcbride-armd-mcast-overview-02>, work in | |||
| progress, July 2012. | progress, July 2012. | |||
| [EDGE-REP] | [EDGE-REP] | |||
| Marques P. et al., "Edge multicast replication for BGP IP | Marques P. et al., "Edge multicast replication for BGP IP | |||
| VPNs," <draft-marques-l3vpn-mcast-edge-01>, work in | VPNs," <draft-marques-l3vpn-mcast-edge-01>, work in | |||
| progress, June 2012. | progress, June 2012. | |||
| [Geneve] | [Geneve] | |||
| Gross, J. and Ganga, I. (Eds.), "Geneve: Generic Network | Gross, J. and Ganga, I. (Eds.), "Geneve: Generic Network | |||
| Virtualization Encapsulation", <draft-ietf-nvo3-geneve- | Virtualization Encapsulation", <draft-ietf-nvo3-geneve- | |||
| 01>, work in progress, January 2016. | 01>, work in progress, January 2016. | |||
| [GUE] | [GUE] | |||
| skipping to change at page 17, line 5 ¶ | skipping to change at page 17, line 5 ¶ | |||
| in progress, April 2016. | in progress, April 2016. | |||
| 10. Acknowledgments | 10. Acknowledgments | |||
| Many thanks are due to Dino Farinacci, Erik Nordmark, Lucy Yong, | Many thanks are due to Dino Farinacci, Erik Nordmark, Lucy Yong, | |||
| Nicolas Bouliane, Saumya Dikshit, Joe Touch, Olufemi Komolafe, and | Nicolas Bouliane, Saumya Dikshit, Joe Touch, Olufemi Komolafe, and | |||
| Matthew Bocci, for their valuable comments and suggestions. | Matthew Bocci, for their valuable comments and suggestions. | |||
| This document was prepared using 2-Word-v2.0.template.dot. | This document was prepared using 2-Word-v2.0.template.dot. | |||
| Authors' Addresses | Authors' Addresses | |||
| Anoop Ghanwani | Anoop Ghanwani | |||
| Dell | Dell | |||
| Email: anoop@alumni.duke.edu | Email: anoop@alumni.duke.edu | |||
| Linda Dunbar | Linda Dunbar | |||
| Huawei Technologies | Huawei Technologies | |||
| 5340 Legacy Drive, Suite 1750 | 5340 Legacy Drive, Suite 1750 | |||
| Plano, TX 75024, USA | Plano, TX 75024, USA | |||
| End of changes. 37 change blocks. | ||||
| 47 lines changed or deleted | 93 lines changed or added | |||
This html diff was produced by rfcdiff 1.48. The latest version is available from http://tools.ietf.org/tools/rfcdiff/ | ||||
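
The flow-count comparison in the PIM-SM / BIDIR-PIM paragraph of the compared text (n*g flows versus g flows) can be illustrated with a short sketch. The function names and the sample values below are hypothetical and are not taken from either draft version; the sketch only restates the arithmetic given in the text, where n is the number of source NVEs per group and g is the number of groups.

```python
def pim_sm_flows(n_source_nves: int, n_groups: int) -> int:
    """Flows required with PIM-SM, per the draft text: n * g."""
    if n_source_nves < 0 or n_groups < 0:
        raise ValueError("counts must be non-negative")
    return n_source_nves * n_groups


def bidir_pim_flows(n_groups: int) -> int:
    """Flows required with BIDIR-PIM, per the draft text: g."""
    if n_groups < 0:
        raise ValueError("counts must be non-negative")
    return n_groups


if __name__ == "__main__":
    # Illustrative values only (assumed, not from the draft).
    n, g = 4, 10
    print("PIM-SM flows:   ", pim_sm_flows(n, g))
    print("BIDIR-PIM flows:", bidir_pim_flows(g))
```

Under these assumptions, the BIDIR-PIM count is independent of the number of source NVEs, which is the scalability advantage the text describes.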