| < draft-ietf-nvo3-mcast-framework-03.txt | draft-ietf-nvo3-mcast-framework-04.txt > | |||
|---|---|---|---|---|
| skipping to change at page 1, line 15 ¶ | skipping to change at page 1, line 15 ¶ | |||
| Expires: August 14, 2016 M. McBride | Expires: August 14, 2016 M. McBride | |||
| Huawei | Huawei | |||
| V. Bannai | V. Bannai | |||
| R. Krishnan | R. Krishnan | |||
| Dell | Dell | |||
| February 15, 2016 | February 15, 2016 | |||
| A Framework for Multicast in NVO3 | A Framework for Multicast in NVO3 | |||
| draft-ietf-nvo3-mcast-framework-03 | draft-ietf-nvo3-mcast-framework-04 | |||
| Status of this Memo | Status of this Memo | |||
| This Internet-Draft is submitted in full conformance with the | This Internet-Draft is submitted in full conformance with the | |||
| provisions of BCP 78 and BCP 79. This document may not be modified, | provisions of BCP 78 and BCP 79. This document may not be modified, | |||
| and derivative works of it may not be created, except to publish it | and derivative works of it may not be created, except to publish it | |||
| as an RFC and to translate it into languages other than English. | as an RFC and to translate it into languages other than English. | |||
| skipping to change at page 3, line 33 ¶ | skipping to change at page 3, line 33 ¶ | |||
| The reader is assumed to be familiar with the terminology as defined | The reader is assumed to be familiar with the terminology as defined | |||
| in the NVO3 Framework document [RFC7365] and NVO3 Architecture | in the NVO3 Framework document [RFC7365] and NVO3 Architecture | |||
| document [NVO3-ARCH]. | document [NVO3-ARCH]. | |||
| 1.1. Infrastructure multicast | 1.1. Infrastructure multicast | |||
| Infrastructure multicast includes protocols such as ARP/ND, DHCP, | Infrastructure multicast includes protocols such as ARP/ND, DHCP, | |||
| and mDNS. It is possible to provide solutions for these that do not | and mDNS. It is possible to provide solutions for these that do not | |||
| involve multicast in the underlay network. In the case of ARP/ND, | involve multicast in the underlay network. In the case of ARP/ND, | |||
| an NVA can be used for distributing the mappings of IP address to | an NVA can be used for distributing the mappings of IP address to | |||
| MAC address to all NVEs. The NVEs can then trap ARP Request/ND | MAC address to all NVEs. The NVEs can then trap ARP Request/ND | |||
| Neighbor Solicitation messages from the TSs that are attached to it | Neighbor Solicitation messages from the TSs that are attached to it | |||
| and respond to them, thereby eliminating the need for | and respond to them, thereby eliminating the need for | |||
| broadcast/multicast of such messages. In the case of DHCP, the NVE | broadcast/multicast of such messages. In the case of DHCP, the NVE | |||
| can be configured to forward these messages using a helper function. | can be configured to forward these messages using a helper function. | |||
| Of course it is possible to support all of these infrastructure | Of course it is possible to support all of these infrastructure | |||
| multicast protocols natively if the underlay provides multicast | multicast protocols natively if the underlay provides multicast | |||
| transport. However, even in the presence of multicast transport, it | transport. However, even in the presence of multicast transport, it | |||
| may be beneficial to use the optimizations mentioned above to reduce | may be beneficial to use the optimizations mentioned above to reduce | |||
| the amount of such traffic in the network. | the amount of such traffic in the network. | |||
| skipping to change at page 4, line 43 ¶ | skipping to change at page 4, line 43 ¶ | |||
| NVGRE: Network Virtualization using GRE | NVGRE: Network Virtualization using GRE | |||
| SSM: Source-Specific Multicast | SSM: Source-Specific Multicast | |||
| STT: Stateless Tunnel Transport | STT: Stateless Tunnel Transport | |||
| TS: Tenant system | TS: Tenant system | |||
| VM: Virtual Machine | VM: Virtual Machine | |||
| VN: Virtual Network | ||||
| VXLAN: Virtual eXtensible LAN | VXLAN: Virtual eXtensible LAN | |||
| 3. Multicast mechanisms in networks that use NVO3 | 3. Multicast mechanisms in networks that use NVO3 | |||
| In NVO3 environments, traffic between NVEs is transported using an | In NVO3 environments, traffic between NVEs is transported using an | |||
| encapsulation such as VXLAN [VXLAN], NVGRE [RFC7637], STT [STT], | encapsulation such as VXLAN [VXLAN], NVGRE [RFC7637], STT [STT], | |||
| etc. | etc. | |||
| Besides the need to support the Address Resolution Protocol (ARP) | Besides the need to support the Address Resolution Protocol (ARP) | |||
| and Neighbor Discovery (ND), there are several applications that | and Neighbor Discovery (ND), there are several applications that | |||
| skipping to change at page 6, line 32 ¶ | skipping to change at page 6, line 32 ¶ | |||
| service without requiring any specific support from the underlay, | service without requiring any specific support from the underlay, | |||
| other than that of a unicast service. A multicast or broadcast | other than that of a unicast service. A multicast or broadcast | |||
| transmission is achieved by replicating the packet at the source | transmission is achieved by replicating the packet at the source | |||
| NVE, and making copies, one for each destination NVE that the | NVE, and making copies, one for each destination NVE that the | |||
| multicast packet must be sent to. | multicast packet must be sent to. | |||
| For this mechanism to work, the source NVE must know, a priori, the | For this mechanism to work, the source NVE must know, a priori, the | |||
| IP addresses of all destination NVEs that need to receive the | IP addresses of all destination NVEs that need to receive the | |||
| packet. For the purpose of ARP/ND, this would involve knowing the | packet. For the purpose of ARP/ND, this would involve knowing the | |||
| IP addresses of all the NVEs that have Tenant Systems in the virtual | IP addresses of all the NVEs that have Tenant Systems in the virtual | |||
| network instance (VNI) of the Tenant System that generated the | network (VN) of the Tenant System that generated the request. | |||
| request. For the support of application-specific multicast traffic, | For the support of application-specific multicast traffic, | |||
| a method similar to that of receiver-sites registration for a | a method similar to that of receiver-sites registration for a | |||
| particular multicast group described in [LISP-Signal-Free] can be | particular multicast group described in [LISP-Signal-Free] can be | |||
| used. The registrations from different receiver-sites can be merged | used. The registrations from different receiver-sites can be merged | |||
| at the NVA, which can construct a multicast replication-list | at the NVA, which can construct a multicast replication-list | |||
| inclusive of all NVEs to which receivers for a particular multicast | inclusive of all NVEs to which receivers for a particular multicast | |||
| group are attached. The replication-list for each specific multicast | group are attached. The replication-list for each specific multicast | |||
| group is maintained by the NVA. | group is maintained by the NVA. | |||
| The receiver-sites registration is achieved by egress NVEs | The receiver-sites registration is achieved by egress NVEs | |||
| performing IGMP/MLD snooping to maintain state for which | performing IGMP/MLD snooping to maintain state for which | |||
| skipping to change at page 7, line 10 ¶ | skipping to change at page 7, line 10 ¶ | |||
| group. When the members of a multicast group are outside the NVO3 | group. When the members of a multicast group are outside the NVO3 | |||
| domain, it is necessary for NVO3 gateways to keep track of the | domain, it is necessary for NVO3 gateways to keep track of the | |||
| remote members of each multicast group. The NVEs and NVO3 gateways | remote members of each multicast group. The NVEs and NVO3 gateways | |||
| then communicate the multicast groups that are of interest to the | then communicate the multicast groups that are of interest to the | |||
| NVA. If the membership is not communicated to the NVA, and if it is | NVA. If the membership is not communicated to the NVA, and if it is | |||
| necessary to prevent hosts attached to an NVE that have not | necessary to prevent hosts attached to an NVE that have not | |||
| subscribed to a multicast group from receiving the multicast | subscribed to a multicast group from receiving the multicast | |||
| traffic, the NVE would need to maintain multicast group membership | traffic, the NVE would need to maintain multicast group membership | |||
| information. | information. | |||
| In multi-homing environments, i.e. in those where a TS is attached | In the absence of IGMP/MLD snooping, the traffic would be delivered | |||
| to all hosts that are part of the VN. | ||||
| In multi-homing environments, i.e., in those where a TS is attached | ||||
| to more than one NVE, the NVA would be expected to provide | to more than one NVE, the NVA would be expected to provide | |||
| information to all of the NVEs under its control about all of the | information to all of the NVEs under its control about all of the | |||
| NVEs to which such a TS is attached. The ingress NVE can choose any | NVEs to which such a TS is attached. The ingress NVE can choose any | |||
| one of the egress NVEs for the data frames destined towards the TS. | one of the egress NVEs for the data frames destined towards the TS. | |||
| In the absence of IGMP/MLD snooping, the traffic would be delivered | ||||
| to all hosts that are part of the VNI. | ||||
| This method requires multiple copies of the same packet to all NVEs | This method requires multiple copies of the same packet to all NVEs | |||
| that participate in the VN. If, for example, a tenant subnet is | that participate in the VN. If, for example, a tenant subnet is | |||
| spread across 50 NVEs, the packet would have to be replicated 50 | spread across 50 NVEs, the packet would have to be replicated 50 | |||
| times at the source NVE. This also creates an issue with the | times at the source NVE. This also creates an issue with the | |||
| forwarding performance of the NVE. | forwarding performance of the NVE. | |||
| Note that this method is similar to what was used in VPLS [RFC4792] | Note that this method is similar to what was used in VPLS [RFC4762] | |||
| prior to support of MPLS multicast [RFC7117]. While there are some | prior to support of MPLS multicast [RFC7117]. While there are some | |||
| similarities between MPLS VPN and the NVO3 overlay, there are some | similarities between MPLS VPN and the NVO3 overlay, there are some | |||
| key differences: | key differences: | |||
| - The CE-to-PE attachment in VPNs is somewhat static, whereas in a | - The CE-to-PE attachment in VPNs is somewhat static, whereas in a | |||
| DC that allows VMs to migrate anywhere, the TS attachment to NVE | DC that allows VMs to migrate anywhere, the TS attachment to NVE | |||
| is much more dynamic. | is much more dynamic. | |||
| - The number of PEs to which a single VPN customer is attached in | - The number of PEs to which a single VPN customer is attached in | |||
| an MPLS VPN environment is normally far less than the number of | an MPLS VPN environment is normally far less than the number of | |||
| NVEs to which a VNI's VMs are attached in a DC. | NVEs to which a VN's VMs are attached in a DC. | |||
| When a VPN customer has multiple multicast groups, [RFC6513] | When a VPN customer has multiple multicast groups, [RFC6513] | |||
| "Multicast VPN" combines all those multicast groups within each | "Multicast VPN" combines all those multicast groups within each | |||
| VPN client to one single multicast group in the MPLS (or VPN) | VPN client to one single multicast group in the MPLS (or VPN) | |||
| core. The result is that messages from any of the multicast | core. The result is that messages from any of the multicast | |||
| groups belonging to one VPN customer will reach all the PE nodes | groups belonging to one VPN customer will reach all the PE nodes | |||
| of the client. In other words, any messages belonging to any | of the client. In other words, any messages belonging to any | |||
| multicast groups under customer X will reach all PEs of the | multicast groups under customer X will reach all PEs of the | |||
| customer X. When the customer X is attached to only a handful of | customer X. When the customer X is attached to only a handful of | |||
| PEs, the use of this approach does not result in excessive wastage | PEs, the use of this approach does not result in excessive wastage | |||
| of bandwidth in the provider's network. | of bandwidth in the provider's network. | |||
| In a DC environment, a typical server/hypervisor based virtual | In a DC environment, a typical server/hypervisor based virtual | |||
| switch may only support tens of VMs (as of this writing). A subnet | switch may only support tens of VMs (as of this writing). A subnet | |||
| with N VMs may be, in the worst case, spread across N vSwitches. | with N VMs may be, in the worst case, spread across N vSwitches. | |||
| Using the "MPLS VPN multicast" approach in such a scenario would | Using the "MPLS VPN multicast" approach in such a scenario would | |||
| require the creation of a Multicast group in the core for this VNI | require the creation of a Multicast group in the core for this VN | |||
| to reach all N NVEs. If only a small percentage of this client's VMs | to reach all N NVEs. If only a small percentage of this client's VMs | |||
| participate in application specific multicast, a great number of | participate in application specific multicast, a great number of | |||
| NVEs will receive multicast traffic that is not forwarded to any | NVEs will receive multicast traffic that is not forwarded to any | |||
| of their attached VMs, resulting in considerable wastage of | of their attached VMs, resulting in considerable wastage of | |||
| bandwidth. | bandwidth. | |||
| Therefore, the Multicast VPN solution may not scale in a DC | Therefore, the Multicast VPN solution may not scale in a DC | |||
| environment with dynamic attachment of Virtual Networks to NVEs and | environment with dynamic attachment of Virtual Networks to NVEs and | |||
| a greater number of NVEs for each virtual network. | a greater number of NVEs for each virtual network. | |||
| skipping to change at page 9, line 14 ¶ | skipping to change at page 9, line 14 ¶ | |||
| - The MSN can obtain the membership information from the NVEs that | - The MSN can obtain the membership information from the NVEs that | |||
| snoop the IGMP/MLD messages. This can be done by having the MSN | snoop the IGMP/MLD messages. This can be done by having the MSN | |||
| communicate with the NVEs, or by having the NVA obtain the | communicate with the NVEs, or by having the NVA obtain the | |||
| information from the NVEs, and in turn have MSN communicate with | information from the NVEs, and in turn have MSN communicate with | |||
| the NVA. | the NVA. | |||
| Unlike the method described in Section 3.2, there is no performance | Unlike the method described in Section 3.2, there is no performance | |||
| impact at the ingress NVE, nor are there any issues with multiple | impact at the ingress NVE, nor are there any issues with multiple | |||
| copies of the same packet from the source NVE to the multicast | copies of the same packet from the source NVE to the multicast | |||
| service node. However there remain issues with multiple copies of | service node. However, there remain issues with multiple copies of | |||
| the same packet on links that are common to the paths from the MSN | the same packet on links that are common to the paths from the MSN | |||
| to each of the egress NVEs. Additional issues that are introduced | to each of the egress NVEs. Additional issues that are introduced | |||
| with this method include the availability of the MSN, methods to | with this method include the availability of the MSN, methods to | |||
| scale the services offered by the MSN, and the sub-optimality of the | scale the services offered by the MSN, and the sub-optimality of the | |||
| delivery paths. | delivery paths. | |||
| Finally, the IP address of the source NVE must be preserved in | Finally, the IP address of the source NVE must be preserved in | |||
| packet copies created at the multicast service node if data plane | packet copies created at the multicast service node if data plane | |||
| learning is in use. This could create problems if IP source address | learning is in use. This could create problems if IP source address | |||
| reverse path forwarding (RPF) checks are in use. | reverse path forwarding (RPF) checks are in use. | |||
| skipping to change at page 9, line 39 ¶ | skipping to change at page 9, line 39 ¶ | |||
| NVE encapsulates the packet with the appropriate IP multicast | NVE encapsulates the packet with the appropriate IP multicast | |||
| address in the tunnel encapsulation header for delivery to the | address in the tunnel encapsulation header for delivery to the | |||
| desired set of NVEs. The protocol in the underlay could be any | desired set of NVEs. The protocol in the underlay could be any | |||
| variant of Protocol Independent Multicast (PIM), or protocol | variant of Protocol Independent Multicast (PIM), or protocol | |||
| dependent multicast, such as [ISIS-Multicast]. | dependent multicast, such as [ISIS-Multicast]. | |||
| If an NVE connects to its attached TSs via a Layer 2 network, there | If an NVE connects to its attached TSs via a Layer 2 network, there | |||
| are multiple ways for NVEs to support the application specific | are multiple ways for NVEs to support the application specific | |||
| multicast: | multicast: | |||
| - The NVE only supports the basic IGMP/MLD snooping function, letting | - The NVE only supports the basic IGMP/MLD snooping function, letting | |||
| the TS routers handle the application specific multicast. This | the TS routers handle the application specific multicast. This | |||
| scheme doesn't utilize the underlay IP multicast protocols. | scheme doesn't utilize the underlay IP multicast protocols. | |||
| - The NVE can act as a pseudo multicast router for the directly | - The NVE can act as a pseudo multicast router for the directly | |||
| attached VMs and support proper mapping of IGMP/MLD messages to | attached VMs and support proper mapping of IGMP/MLD messages to | |||
| the messages needed by the underlay IP multicast protocols. | the messages needed by the underlay IP multicast protocols. | |||
| With this method, there are none of the issues with the methods | With this method, there are none of the issues with the methods | |||
| described in Sections 3.2. | described in Sections 3.2. | |||
| With PIM Sparse Mode (PIM-SM), the number of flows required would be | With PIM Sparse Mode (PIM-SM), the number of flows required would be | |||
| (n*g), where n is the number of source NVEs that source packets for | (n*g), where n is the number of source NVEs that source packets for | |||
| the group, and g is the number of groups. Bidirectional PIM (BIDIR- | the group, and g is the number of groups. Bidirectional PIM (BIDIR- | |||
| PIM) would offer better scalability with the number of flows | PIM) would offer better scalability with the number of flows | |||
| skipping to change at page 13, line 10 ¶ | skipping to change at page 13, line 10 ¶ | |||
| application specific multicast in networks that use NVO3. It | application specific multicast in networks that use NVO3. It | |||
| highlights the basics of each mechanism and some of the issues with | highlights the basics of each mechanism and some of the issues with | |||
| them. As solutions are developed, the protocols would need to | them. As solutions are developed, the protocols would need to | |||
| consider the use of these mechanisms and co-existence may be a | consider the use of these mechanisms and co-existence may be a | |||
| consideration. It also highlights some of the requirements for | consideration. It also highlights some of the requirements for | |||
| supporting multicast applications in an NVO3 network. | supporting multicast applications in an NVO3 network. | |||
| 7. Security Considerations | 7. Security Considerations | |||
| This draft does not introduce any new security considerations beyond | This draft does not introduce any new security considerations beyond | |||
| what may be present in proposed solutions | what may be present in proposed solutions. | |||
| 8. IANA Considerations | 8. IANA Considerations | |||
| This document requires no IANA actions. RFC Editor: Please remove | This document requires no IANA actions. RFC Editor: Please remove | |||
| this section before publication. | this section before publication. | |||
| 9. References | 9. References | |||
| 9.1. Normative References | 9.1. Normative References | |||
| skipping to change at page 14, line 15 ¶ | skipping to change at page 14, line 15 ¶ | |||
| [STT] Davie, B. and Gross, J., "A stateless transport tunneling | [STT] Davie, B. and Gross, J., "A stateless transport tunneling | |||
| protocol for network virtualization," work in progress. | protocol for network virtualization," work in progress. | |||
| [DC-MC] McBride, M. and Lui, H., "Multicast in the data center | [DC-MC] McBride, M. and Lui, H., "Multicast in the data center | |||
| overview," work in progress. | overview," work in progress. | |||
| [ISIS-Multicast] | [ISIS-Multicast] | |||
| Yong, L. et al., "ISIS Protocol Extension for Building | Yong, L. et al., "ISIS Protocol Extension for Building | |||
| Distribution Trees", work in progress. | Distribution Trees", work in progress. | |||
| [RFC4792] Lasserre, M., and Kompella, V. (Eds.), "Virtual Private | [RFC4762] Lasserre, M., and Kompella, V. (Eds.), "Virtual Private | |||
| LAN Service (VPLS) using Label Distribution Protocol (LDP) | LAN Service (VPLS) using Label Distribution Protocol (LDP) | |||
| signaling," RFC 4762, January 2007. | signaling," January 2007. | |||
| [RFC7117] Aggarwal, R. et al., "Multicast in VPLS," February 2014. | [RFC7117] Aggarwal, R. et al., "Multicast in VPLS," February 2014. | |||
| [LANE] "LAN emulation over ATM," The ATM Forum, af-lane-0021.000, | [LANE] "LAN emulation over ATM," The ATM Forum, af-lane-0021.000, | |||
| January 1995. | January 1995. | |||
| [EDGE-REP] | [EDGE-REP] | |||
| Marques P. et al., "Edge multicast replication for BGP IP | Marques P. et al., "Edge multicast replication for BGP IP | |||
| VPNs," work in progress. | VPNs," work in progress. | |||
| End of changes. 16 change blocks. | ||||
| 18 lines changed or deleted | 20 lines changed or added | |||
This html diff was produced by rfcdiff 1.48. The latest version is available from http://tools.ietf.org/tools/rfcdiff/
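
The hunk at page 6, line 32 describes replication at the source NVE: receiver-site registrations are merged at the NVA into a per-group replication-list, and the source NVE sends one unicast copy to each destination NVE on that list. The Python sketch below is only an illustration of that idea, not anything specified by the draft. The class `ReplicationListNVA`, the function `replicate_at_source`, the `(vn_id, group)` key, and the choice to skip the source NVE's own address are all assumptions made for the example.

```python
from collections import defaultdict


class ReplicationListNVA:
    """Hypothetical NVA that merges receiver-site registrations per (VN, group)."""

    def __init__(self):
        # (vn_id, group) -> set of NVE IP addresses that reported receivers
        self._lists = defaultdict(set)

    def register(self, vn_id, group, nve_ip):
        """Record that an egress NVE has at least one receiver for the group."""
        self._lists[(vn_id, group)].add(nve_ip)

    def unregister(self, vn_id, group, nve_ip):
        """Remove an egress NVE once its last receiver has left the group."""
        self._lists[(vn_id, group)].discard(nve_ip)

    def replication_list(self, vn_id, group):
        """Return the merged list of NVEs for the group, in sorted order."""
        return sorted(self._lists.get((vn_id, group), set()))


def replicate_at_source(nva, vn_id, group, source_nve, payload):
    """Build one unicast copy per destination NVE on the replication-list.

    Assumption: the source NVE does not send a copy to itself.
    """
    return [
        (dst, payload)
        for dst in nva.replication_list(vn_id, group)
        if dst != source_nve
    ]
```

With three NVEs registered for a group and one of them acting as the source, `replicate_at_source` returns two copies. The number of copies grows linearly with the number of destination NVEs, which is the forwarding cost at the source NVE that the draft text points out.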