| < draft-ietf-nvo3-mcast-framework-02.txt | draft-ietf-nvo3-mcast-framework-03.txt > | |||
|---|---|---|---|---|
| NVO3 working group A. Ghanwani | NVO3 working group A. Ghanwani | |||
| Internet Draft Dell | Internet Draft Dell | |||
| Intended status: Informational L. Dunbar | Intended status: Informational L. Dunbar | |||
| Expires: August 9, 2016 M. McBride | Expires: August 14, 2016 M. McBride | |||
| Huawei | Huawei | |||
| V. Bannai | V. Bannai | |||
| R. Krishnan | R. Krishnan | |||
| Dell | Dell | |||
| February 10, 2016 | February 15, 2016 | |||
| A Framework for Multicast in NVO3 | A Framework for Multicast in NVO3 | |||
| draft-ietf-nvo3-mcast-framework-02 | draft-ietf-nvo3-mcast-framework-03 | |||
| Status of this Memo | Status of this Memo | |||
| This Internet-Draft is submitted in full conformance with the | This Internet-Draft is submitted in full conformance with the | |||
| provisions of BCP 78 and BCP 79. This document may not be modified, | provisions of BCP 78 and BCP 79. This document may not be modified, | |||
| and derivative works of it may not be created, except to publish it | and derivative works of it may not be created, except to publish it | |||
| as an RFC and to translate it into languages other than English. | as an RFC and to translate it into languages other than English. | |||
| skipping to change at page 1, line 43 ¶ | skipping to change at page 1, line 43 ¶ | |||
| months and may be updated, replaced, or obsoleted by other documents | months and may be updated, replaced, or obsoleted by other documents | |||
| at any time. It is inappropriate to use Internet-Drafts as | at any time. It is inappropriate to use Internet-Drafts as | |||
| reference material or to cite them other than as "work in progress." | reference material or to cite them other than as "work in progress." | |||
| The list of current Internet-Drafts can be accessed at | The list of current Internet-Drafts can be accessed at | |||
| http://www.ietf.org/ietf/1id-abstracts.txt | http://www.ietf.org/ietf/1id-abstracts.txt | |||
| The list of Internet-Draft Shadow Directories can be accessed at | The list of Internet-Draft Shadow Directories can be accessed at | |||
| http://www.ietf.org/shadow.html | http://www.ietf.org/shadow.html | |||
| This Internet-Draft will expire on August 9 2016. | This Internet-Draft will expire on August 14, 2016. | |||
| Copyright Notice | Copyright Notice | |||
| Copyright (c) 2016 IETF Trust and the persons identified as the | Copyright (c) 2016 IETF Trust and the persons identified as the | |||
| document authors. All rights reserved. | document authors. All rights reserved. | |||
| This document is subject to BCP 78 and the IETF Trust's Legal | This document is subject to BCP 78 and the IETF Trust's Legal | |||
| Provisions Relating to IETF Documents | Provisions Relating to IETF Documents | |||
| (http://trustee.ietf.org/license-info) in effect on the date of | (http://trustee.ietf.org/license-info) in effect on the date of | |||
| publication of this document. Please review these documents | publication of this document. Please review these documents | |||
| skipping to change at page 3, line 33 ¶ | skipping to change at page 3, line 33 ¶ | |||
| The reader is assumed to be familiar with the terminology as defined | The reader is assumed to be familiar with the terminology as defined | |||
| in the NVO3 Framework document [RFC7365] and NVO3 Architecture | in the NVO3 Framework document [RFC7365] and NVO3 Architecture | |||
| document [NVO3-ARCH]. | document [NVO3-ARCH]. | |||
| 1.1. Infrastructure multicast | 1.1. Infrastructure multicast | |||
| Infrastructure multicast includes protocols such as ARP/ND, DHCP, | Infrastructure multicast includes protocols such as ARP/ND, DHCP, | |||
| and mDNS. It is possible to provide solutions for these that do not | and mDNS. It is possible to provide solutions for these that do not | |||
| involve multicast in the underlay network. In the case of ARP/ND, | involve multicast in the underlay network. In the case of ARP/ND, | |||
| an NVA can be used for distributing the mappings of IP address to | an NVA can be used for distributing the mappings of IP address to | |||
| MAC address to all NVEs, and the NVEs can respond to ARP messages | MAC address to all NVEs. The NVEs can then trap ARP Request/ND | |||
| from the TSs that are attached to it in a way that is similar to | Neighbor Solicitation messages from the TSs that are attached to it | |||
| proxy-ARP. In the case of DHCP, the NVE can be configured to | and respond to them, thereby eliminating the need for | |||
| forward these messages using a helper function. | broadcast/multicast of such messages. In the case of DHCP, the NVE | |||
| can be configured to forward these messages using a helper function. | ||||
| Of course it is possible to support all of these infrastructure | Of course it is possible to support all of these infrastructure | |||
| multicast protocols natively if the underlay provides multicast | multicast protocols natively if the underlay provides multicast | |||
| transport. However, even in the presence of multicast transport, it | transport. However, even in the presence of multicast transport, it | |||
| may be beneficial to use the optimizations mentioned above to reduce | may be beneficial to use the optimizations mentioned above to reduce | |||
| the amount of such traffic in the network. | the amount of such traffic in the network. | |||
| 1.2. Application-specific multicast | 1.2. Application-specific multicast | |||
| Application-specific multicast traffic, which may be either Source- | Application-specific multicast traffic, which may be either Source- | |||
| Specific Multicast (SSM) or Any-Source Multicast (ASM) [RFC3569], | Specific Multicast (SSM) or Any-Source Multicast (ASM) [RFC3569], | |||
| has the following characteristics: | has the following characteristics: | |||
| 1. Receiver hosts are expected to subscribe to multicast content | 1. Receiver hosts are expected to subscribe to multicast content | |||
| using protocols such as IGMP [RFC3376] (IPv4) or MLD (IPv6). | using protocols such as IGMP [RFC3376] (IPv4) or MLD (IPv6). | |||
| Multicast sources and listeners participate in these protocols | Multicast sources and listeners participate in these protocols | |||
| using addresses that are in the Tenant System address domain. | using addresses that are in the Tenant System address domain. | |||
| 2. The list of multicast listeners for each multicast group is not | 2. The list of multicast listeners for each multicast group is not | |||
| known in advance. Therefore, it may not be possible for an NVA | known in advance. Therefore, it may not be possible for an NVA | |||
| to get the list of participants for each multicast group ahead | to get the list of participants for each multicast group ahead | |||
| skipping to change at page 7, line 10 ¶ | skipping to change at page 7, line 10 ¶ | |||
| group. When the members of a multicast group are outside the NVO3 | group. When the members of a multicast group are outside the NVO3 | |||
| domain, it is necessary for NVO3 gateways to keep track of the | domain, it is necessary for NVO3 gateways to keep track of the | |||
| remote members of each multicast group. The NVEs and NVO3 gateways | remote members of each multicast group. The NVEs and NVO3 gateways | |||
| then communicate the multicast groups that are of interest to the | then communicate the multicast groups that are of interest to the | |||
| NVA. If the membership is not communicated to the NVA, and if it is | NVA. If the membership is not communicated to the NVA, and if it is | |||
| necessary to prevent hosts attached to an NVE that have not | necessary to prevent hosts attached to an NVE that have not | |||
| subscribed to a multicast group from receiving the multicast | subscribed to a multicast group from receiving the multicast | |||
| traffic, the NVE would need to maintain multicast group membership | traffic, the NVE would need to maintain multicast group membership | |||
| information. | information. | |||
| In multi-homing environments, i.e. more than one NVE can reach a | In multi-homing environments, i.e. in those where a TS is attached | |||
| specific TS, the NVA would be expected to provide all the NVEs that | to more than one NVE, the NVA would be expected to provide | |||
| can reach the given TS. The ingress NVE can choose any one of the | information to all of the NVEs under its control about all of the | |||
| egress NVEs for the data frames destined towards the TS. | NVEs to which such a TS is attached. The ingress NVE can choose any | |||
| one of the egress NVEs for the data frames destined towards the TS. | ||||
| In the absence of IGMP/MLD snooping, the traffic would be delivered | In the absence of IGMP/MLD snooping, the traffic would be delivered | |||
| to all hosts that are part of the VNI. | to all hosts that are part of the VNI. | |||
| This method requires multiple copies of the same packet to all NVEs | This method requires multiple copies of the same packet to all NVEs | |||
| that participate in the VN. If, for example, a tenant subnet is | that participate in the VN. If, for example, a tenant subnet is | |||
| spread across 50 NVEs, the packet would have to be replicated 50 | spread across 50 NVEs, the packet would have to be replicated 50 | |||
| times at the source NVE. This also creates an issue with the | times at the source NVE. This also creates an issue with the | |||
| forwarding performance of the NVE. | forwarding performance of the NVE. | |||
| Note that this method is similar to what was used in VPLS [RFC4762] | Note that this method is similar to what was used in VPLS [RFC4762] | |||
| prior to support of MPLS multicast [RFC7117]. While there are some | prior to support of MPLS multicast [RFC7117]. While there are some | |||
| similarities between MPLS VPN and the NVO3 overlay, there are some | similarities between MPLS VPN and the NVO3 overlay, there are some | |||
| key differences: | key differences: | |||
| - The CE-to-PE attachment in VPNs is somewhat static, whereas in a | - The CE-to-PE attachment in VPNs is somewhat static, whereas in a | |||
| DC that allows VMs to migrate anywhere, the TS attachment to NVE | DC that allows VMs to migrate anywhere, the TS attachment to NVE | |||
| is much more dynamic. | is much more dynamic. | |||
| - The number of PEs to which a single VPN customer is attached in | - The number of PEs to which a single VPN customer is attached in | |||
| an MPLS VPN environment is normally far less than the number of | an MPLS VPN environment is normally far less than the number of | |||
| NVEs to which a VNI's VMs are attached in a DC. | NVEs to which a VNI's VMs are attached in a DC. | |||
| When a VPN customer has multiple multicast groups, [RFC6513] | When a VPN customer has multiple multicast groups, [RFC6513] | |||
| "Multicast VPN" combines all those multicast groups within each | "Multicast VPN" combines all those multicast groups within each | |||
| VPN client to one single multicast group in the MPLS (or VPN) | VPN client to one single multicast group in the MPLS (or VPN) | |||
| core. The result is that messages from any of the multicast | core. The result is that messages from any of the multicast | |||
| groups belonging to one VPN customer will reach all the PE nodes | groups belonging to one VPN customer will reach all the PE nodes | |||
| of the client. In other words, any messages belonging to any | of the client. In other words, any messages belonging to any | |||
| multicast groups under customer X will reach all PEs of the | multicast groups under customer X will reach all PEs of the | |||
| customer X. When the customer X is attached to only a handful of | customer X. When the customer X is attached to only a handful of | |||
| PEs, the use of this approach does not result in excessive wastage | PEs, the use of this approach does not result in excessive wastage | |||
| of bandwidth in the provider's network. | of bandwidth in the provider's network. | |||
| In a DC environment, a typical server/hypervisor based virtual | In a DC environment, a typical server/hypervisor based virtual | |||
| switch may only support tens of VMs (as of this writing). A subnet | switch may only support tens of VMs (as of this writing). A subnet | |||
| with N VMs may be, in the worst case, spread across N vSwitches. | with N VMs may be, in the worst case, spread across N vSwitches. | |||
| Using the "MPLS VPN multicast" approach in such a scenario would | Using the "MPLS VPN multicast" approach in such a scenario would | |||
| require the creation of a multicast group in the core for this VNI | require the creation of a multicast group in the core for this VNI | |||
| to reach all N NVEs. If only a small percentage of this client's VMs | to reach all N NVEs. If only a small percentage of this client's VMs | |||
| participate in application-specific multicast, a great number of | participate in application-specific multicast, a great number of | |||
| NVEs will receive multicast traffic that is not forwarded to any | NVEs will receive multicast traffic that is not forwarded to any | |||
| of their attached VMs, resulting in considerable wastage of | of their attached VMs, resulting in considerable wastage of | |||
| bandwidth. | bandwidth. | |||
| Therefore, the Multicast VPN solution may not scale in a DC | Therefore, the Multicast VPN solution may not scale in a DC | |||
| environment with dynamic attachment of Virtual Networks to NVEs and | environment with dynamic attachment of Virtual Networks to NVEs and | |||
| a greater number of NVEs for each virtual network. | a greater number of NVEs for each virtual network. | |||
| 3.3. Replication at a multicast service node | 3.3. Replication at a multicast service node | |||
| With this method, all multicast packets would be sent using a | With this method, all multicast packets would be sent using a | |||
| unicast tunnel encapsulation from the ingress NVE to a multicast | unicast tunnel encapsulation from the ingress NVE to a multicast | |||
| service node (MSN). The MSN, in turn, would create multiple copies | service node (MSN). The MSN, in turn, would create multiple copies | |||
| skipping to change at page 9, line 39 ¶ | skipping to change at page 9, line 39 ¶ | |||
| NVE encapsulates the packet with the appropriate IP multicast | NVE encapsulates the packet with the appropriate IP multicast | |||
| address in the tunnel encapsulation header for delivery to the | address in the tunnel encapsulation header for delivery to the | |||
| desired set of NVEs. The protocol in the underlay could be any | desired set of NVEs. The protocol in the underlay could be any | |||
| variant of Protocol Independent Multicast (PIM), or protocol | variant of Protocol Independent Multicast (PIM), or protocol | |||
| dependent multicast, such as [ISIS-Multicast]. | dependent multicast, such as [ISIS-Multicast]. | |||
| If an NVE connects to its attached TSs via a Layer 2 network, there | If an NVE connects to its attached TSs via a Layer 2 network, there | |||
| are multiple ways for NVEs to support application-specific | are multiple ways for NVEs to support application-specific | |||
| multicast: | multicast: | |||
| - The NVE supports only the basic IGMP/MLD snooping function and | - The NVE supports only the basic IGMP/MLD snooping function and | |||
| lets the TSs' routers handle the application-specific multicast. This | lets the TSs' routers handle the application-specific multicast. This | |||
| scheme doesn't utilize the underlay IP multicast protocols. | scheme doesn't utilize the underlay IP multicast protocols. | |||
| - The NVE can act as a pseudo multicast router for the directly | - The NVE can act as a pseudo multicast router for the directly | |||
| attached VMs and support proper mapping of IGMP/MLD messages to | attached VMs and support proper mapping of IGMP/MLD messages to | |||
| the messages needed by the underlay IP multicast protocols. | the messages needed by the underlay IP multicast protocols. | |||
| With this method, there are none of the issues with the methods | With this method, there are none of the issues with the methods | |||
| described in Sections 3.2. | described in Sections 3.2. | |||
| With PIM Sparse Mode (PIM-SM), the number of flows required would be | With PIM Sparse Mode (PIM-SM), the number of flows required would be | |||
| (n*g), where n is the number of source NVEs that source packets for | (n*g), where n is the number of source NVEs that source packets for | |||
| the group, and g is the number of groups. Bidirectional PIM (BIDIR- | the group, and g is the number of groups. Bidirectional PIM (BIDIR- | |||
| PIM) would offer better scalability with the number of flows | PIM) would offer better scalability with the number of flows | |||
| skipping to change at page 13, line 27 ¶ | skipping to change at page 13, line 27 ¶ | |||
| 9. References | 9. References | |||
| 9.1. Normative References | 9.1. Normative References | |||
| [RFC7365] Lasserre, M. et al., "Framework for data center (DC) | [RFC7365] Lasserre, M. et al., "Framework for data center (DC) | |||
| network virtualization", October 2014. | network virtualization", October 2014. | |||
| [RFC7364] Narten, T. et al., "Problem statement: Overlays for | [RFC7364] Narten, T. et al., "Problem statement: Overlays for | |||
| network virtualization", October 2014. | network virtualization", October 2014. | |||
| [NVO3-ARCH] Narten, T. et al., "An Architecture for Overlay Networks | [NVO3-ARCH] | |||
| (NVO3)", work in progress, February 2014. | Narten, T. et al., "An Architecture for Overlay Networks | |||
| (NVO3)", work in progress. | ||||
| [RFC3376] Cain, B. et al., "Internet Group Management Protocol, | [RFC3376] Cain, B. et al., "Internet Group Management Protocol, | |||
| Version 3", October 2002. | Version 3", October 2002. | |||
| [RFC6513] Rosen, E. et al., "Multicast in MPLS/BGP IP VPNs", | [RFC6513] Rosen, E. et al., "Multicast in MPLS/BGP IP VPNs", | |||
| February 2012. | February 2012. | |||
| 9.2. Informative References | 9.2. Informative References | |||
| [RFC7348] Mahalingam, M. et al., "Virtual eXtensible Local Area | [RFC7348] Mahalingam, M. et al., "Virtual eXtensible Local Area | |||
| Network (VXLAN): A Framework for Overlaying Virtualized | Network (VXLAN): A Framework for Overlaying Virtualized | |||
| Layer 2 Networks over Layer 3 Networks", August 2014. | Layer 2 Networks over Layer 3 Networks", August 2014. | |||
| [RFC7637] Garg P. and Wang, Y. (Eds.), "NVGRE: Network | [RFC7637] Garg, P. and Wang, Y. (Eds.), "NVGRE: Network | |||
| Virtualization using Generic Routing Encapsulation", | Virtualization using Generic Routing Encapsulation", | |||
| September 2015. | September 2015. | |||
| [STT] Davie, B. and Gross, J., "A stateless transport tunneling | [STT] Davie, B. and Gross, J., "A stateless transport tunneling | |||
| protocol for network virtualization," work in progress. | protocol for network virtualization," work in progress. | |||
| [DC-MC] McBride M., and Lui, H., "Multicast in the data center | [DC-MC] McBride, M. and Lui, H., "Multicast in the data center | |||
| overview," work in progress. | overview," work in progress. | |||
| [ISIS-Multicast] | [ISIS-Multicast] | |||
| L. Yong, et al., "ISIS Protocol Extension for Building | Yong, L. et al., "ISIS Protocol Extension for Building | |||
| Distribution Trees", work in progress. | Distribution Trees", work in progress. | |||
| [RFC4762] Lasserre, M., and Kompella, V. (Eds.), "Virtual Private | [RFC4762] Lasserre, M., and Kompella, V. (Eds.), "Virtual Private | |||
| LAN Service (VPLS) using Label Distribution Protocol (LDP) | LAN Service (VPLS) using Label Distribution Protocol (LDP) | |||
| signaling," January 2007. | signaling," RFC 4762, January 2007. | |||
| [RFC7117] Aggarwal, R. et al., "Multicast in VPLS," February 2014. | [RFC7117] Aggarwal, R. et al., "Multicast in VPLS," February 2014. | |||
| [LANE] "LAN emulation over ATM," The ATM Forum, af-lane-0021.000, | [LANE] "LAN emulation over ATM," The ATM Forum, af-lane-0021.000, | |||
| January 1995. | January 1995. | |||
| [EDGE-REP] | [EDGE-REP] | |||
| Marques, P. et al., "Edge multicast replication for BGP IP | Marques, P. et al., "Edge multicast replication for BGP IP | |||
| VPNs," work in progress. | VPNs," work in progress. | |||
| [RFC3569] S. Bhattacharyya, Ed., "An Overview of Source-Specific | [RFC3569] | |||
| S. Bhattacharyya, Ed., "An Overview of Source-Specific | ||||
| Multicast (SSM)", July 2003. | Multicast (SSM)", July 2003. | |||
| [LISP-Signal-Free] | [LISP-Signal-Free] | |||
| Moreno, V. and Farinacci, D., "Signal-Free LISP | Moreno, V. and Farinacci, D., "Signal-Free LISP | |||
| Multicast", work in progress. | Multicast", work in progress. | |||
| 10. Acknowledgments | 10. Acknowledgments | |||
| Thanks are due to Dino Farinacci, Erik Nordmark, Lucy Yong, and | Thanks are due to Dino Farinacci, Erik Nordmark, Lucy Yong, Nicolas | |||
| Nicolas Bouliane, for their comments and suggestions. | Bouliane, and Saumya Dikshit for their comments and suggestions. | |||
| This document was prepared using 2-Word-v2.0.template.dot. | This document was prepared using 2-Word-v2.0.template.dot. | |||
| Authors' Addresses | Authors' Addresses | |||
| Anoop Ghanwani | Anoop Ghanwani | |||
| Dell | Dell | |||
| Email: anoop@alumni.duke.edu | Email: anoop@alumni.duke.edu | |||
| Linda Dunbar | Linda Dunbar | |||
| Huawei Technologies | Huawei Technologies | |||
| 5340 Legacy Drive, Suite 1750 | 5340 Legacy Drive, Suite 1750 | |||
| Plano, TX 75024, USA | Plano, TX 75024, USA | |||
| Phone: (469) 277 5840 | Phone: (469) 277 5840 | |||
| Email: ldunbar@huawei.com | Email: ldunbar@huawei.com | |||
| Mike McBride | Mike McBride | |||
| Huawei Technologies | Huawei Technologies | |||
| mmcbride7@gmail.com | Email: mmcbride7@gmail.com | |||
| Vinay Bannai | Vinay Bannai | |||
| Email: vbannai@gmail.com | Email: vbannai@gmail.com | |||
| Ram Krishnan | Ram Krishnan | |||
| Dell | Dell | |||
| Email: ramkri123@gmail.com | Email: ramkri123@gmail.com | |||
| End of changes. 25 change blocks. | ||||
| 55 lines changed or deleted | 59 lines changed or added | |||
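
The scaling arguments quoted in the diff above (50 copies at the source NVE for a subnet spread across 50 NVEs, n*g flows for PIM-SM, and NVEs that receive group traffic without any attached listeners) can be checked with simple arithmetic. The following is a minimal Python sketch of those counts only. The function names and the sample inputs other than the 50-NVE figure are illustrative assumptions and do not appear in either draft version.

```python
"""Back-of-the-envelope counts for the multicast scaling arguments.

All function names are hypothetical; only the formulas come from the draft text.
"""


def source_replication_copies(nve_count: int) -> int:
    """Copies the source NVE makes per multicast packet.

    Follows the draft's example as written: a tenant subnet spread
    across 50 NVEs means the packet is replicated 50 times.
    """
    if nve_count < 0:
        raise ValueError("nve_count must be non-negative")
    return nve_count


def pim_sm_flow_count(source_nves: int, groups: int) -> int:
    """Flows required with PIM-SM: n * g, per the draft text.

    n is the number of source NVEs, g the number of groups.
    """
    if source_nves < 0 or groups < 0:
        raise ValueError("counts must be non-negative")
    return source_nves * groups


def nves_without_listeners(total_nves: int, nves_with_listeners: int) -> int:
    """NVEs that receive group traffic but forward it to no attached VM.

    Models the case where one core multicast group reaches every NVE
    of the VNI while only some NVEs have subscribed VMs.
    """
    if not 0 <= nves_with_listeners <= total_nves:
        raise ValueError("nves_with_listeners must be between 0 and total_nves")
    return total_nves - nves_with_listeners


if __name__ == "__main__":
    print("copies at source NVE:", source_replication_copies(50))
    print("PIM-SM flows (n=4, g=10):", pim_sm_flow_count(4, 10))
    print("NVEs receiving unwanted traffic:", nves_without_listeners(100, 5))
```

The three counts are kept as separate functions because each belongs to a different delivery method discussed in the draft, and they are not meant to be compared on a single axis.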