ARMD BOF

Date: 11/12/10 (Friday)
Time: 9:30am - 12:30am

Discussion during Presentations:

1. Mark Pearson (HP)

There are three issues around ARP:
• one around ARP issues and data;
• a second regarding Layer data plane interface machines in virtual machines; and
• a third related to the virtual data center and migration.

Leon Patey: The problem is that VMs are migrating and non-migrating. There may be too many hypervisors.
(cisco): Is this a problem for the hypervisors or for the IETF? The problem is when the VMs move.
(Leon): They are in control. They can handle the ARPs when they move.
(Linda): We are just trying to get the ARPs handled.
(cisco): IP aware. You are discussing the IP redirect. In that case, they behave as Layer 3 devices the moment they are looking at ICMP redirect. They are Layer 2 from top of rack up, and Layer 3 down. They have top-of-rack (TOR) switches. You can direct redirect.
(cisco): The TOR is in the broadcast domain.
[Himanshu]: In one of your slides, they do not announce themselves. Is this hide-and-seek? No one can find them.
[Linda]: Most people announce them all the time.
[Himanshu]: If they move, and they announce.
[Rahul]: The scope of the work should cover interaction with the network. The API is between the VM and the network. This bears directly on the scope of the work, on the interaction, and on extending the working group. The solution(s) do depend on the location of the default router.
[Jin-Shin-WE]: VMs and VM moves have security issues. A security mechanism should be included.
____(cisco?): Another clarification on the VLANs, if you are the head of the VLANs. You do not see the VLANs with 10K ports. There are 10K ports for 50 VLANs for the same interface. I will receive the VLANs.
[Linda]: In the past, with the VLAN servers, all the broadcast messages would not go through the VLANs. When we have the VLANs going through the server.
[___(cisco)?]: What should happen? Why is this a problem?
[Linda]: The servers see a lot of information.
They get a great deal of ARP messages.
[Ralph]: As part of the BoF: why would a particular server get so many ARP messages when there are 100 hosts, not increasing? Not guaranteed; may cut across VLANs. The VLANs are distributed.
[Jia ____]: Does this belong in the peer-to-peer working group? [editor: ?is this correct]
[Linda]: We are aware of the peer working group and of the TRILL work.
[Malacom]: The TRILL working group has this level of scope outside the world.

IPv6 presentation Discussion:

[Brian]: People are building processes in which you can put 2**64 hosts, a vast number of hosts, on v6. This will generate multicast from the standards stuff. The stuff will go on the multicast.
[Bob]: We decided to allow many numbers of subnets. I think it can go much bigger than 2**64. In IPv6, the same IEEE devices are on the same subnet. It would be appropriate to see how many. The solicited uses the same multicast group as protocol. It would seem that the multicast switches do multicast switching. That is something the switches should have.
[Jari]: The hosts are not receiving the packet. Is the ARP seen on the IGMP snooping?
[Brian]: Is there no limit on virtual hosts, with SLAAC and DAD? It could be large.
[Samung]: This looks very good for scalability. Have you looked at draft-low-pan -6id? Some of the draft multicast issues are being looked at. The multicast issues are being addressed.
[Xiahou]: I suggest you leave IPv6 aside. We could look at the IPv4 address.
[Ross Callon]: I understand the focus on IPv4. We want the world to go v6. V6 has a chance to do something better than v4. It is better that we focus on v4.

Ralph Questions
------------------
Ralph Droms
1. I looked for Wireshark traces of ARPs. It is important to know how much occurs with and without VMs.

Rahul: The meta comment is that we do not know the actors. Are the actors the VM, the network, or the L2-L3?
Himanshu: The assumptions are simple: a) gratuitous ARP, b) simple solution, and c) deployment.
Santil: The simple problem is the large L2 broadcast domain versus creating subnets.

Summary of the ARMD BoF discussion after presentations
------------------------------------------------------

Bob Hinden: It seems like a problem of large broadcast domains, a well-known problem; that's why we use IP subnetting. It seems the virtualization industry hasn't got that. Not sure there's really a problem here. Regarding IPv6, I suspect that while you are trying to change/extend ARP the IPv4 window has passed, and by the time you get it done it may not really matter anymore. I recommend analyzing the problem for IPv6; I think for IPv4 it's too late and we won't turn things around in 3 weeks.

Brian Carpenter: I second Bob's comment. In addition, I've seen an attempt at interconnecting datacenters over 15000 km; it was horrible. Don't do that. Don't try to rebuild the sort of network that CERN, Boeing, or Microsoft were running in the late 90's with a massive number of hosts in the same subnet. The fact that this is in a rack doesn't change that.

Claudio DeSanti: A fundamental problem is the scaling of larger broadcast domains. Current virtualization technology makes this a realistic scenario: with current VM technology it is really possible to create Layer 2 domains that are large. This is the real problem here. I don't see VM movement as a problem per se. It doesn't happen that frequently; you want the datacenter to do real things for you, not keep moving VMs around. The fundamental reason for a move is VM failure, and you don't have one such every second. Granularity is in the order of minutes... So VM movement is not a problem. Scalability of broadcast domains with large numbers of hosts is the problem. Agree that IPv6 should be analyzed first.

Jari Arkko: A big mistake for BoFs: we used too much time talking about solutions. Let's focus on the issue. Comment on the IPv4/IPv6 stuff: I think it's possible there's a problem on the IPv4 side, and we might debate whether to solve that.
For IPv6 it seems we do not have a problem. What I am not sure about is VM migration; that part might not be entirely solved, and analysis has to be done. So if you move to IPv6, your ARP problems disappear.

Linda Dunbar: It's not just about large broadcast domains. A broadcast domain might be small, but there are many small broadcast domains on a TOR switch. In the end you get the same massive ARP load.

Suresh Krishnan: Agree on focusing on IPv6. Also, regarding the IPv6 flooding issue, I am not sure what the problem is: the host will not be bothered, because Neighbor Solicitations are sent to the solicited-node multicast address, which encodes the low-order 24 bits of the IPv6 address to be resolved.

Ralph Droms: Follow-up on Jari's comment. The first step is to decide what the problem is. We still need to answer the questions: is there a problem, how big is it, etc. It seems like there are many problems that pop up in different places as we look at it. I think it is right that it is not strictly a large broadcast domain issue; it's also a combination of smaller broadcast domains with a large number of VLANs. I was hoping to see in the summary some Wireshark traces to help us characterize the ARP traffic in various parts of the datacenter.

Rahul Aggarwal: ARP doesn't exist in isolation; there is L2/L3, network, ..., you rewrite the application. What we need to do is to get a set of operators and figure it out. OPS area - step back.... Charter: "All solutions should not require any behavior change on hosts, applications and Virtual Machines." This is very limiting; we should not rule out hypervisor framework work, e.g., the network part of the hypervisor.

Himanshu Shah: To answer the question: yes, we should work on a solution; yes, this is a problem. Specific behavior of the VM shouldn't be ruled out.

Claudio DeSanti: Back on large broadcast domains vs. number of systems. The problem is the number of ARP requests; it's not a problem of VLANs / subnets etc., it's the number of hosts.

Linda Dunbar: Agrees.
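
[editor: Suresh Krishnan's point above about solicited-node multicast can be sketched in a few lines of Python. This is an illustration only, not something presented at the BoF. It assumes the solicited-node prefix ff02::1:ff00:0/104 from RFC 4291; the function name and the example addresses (taken from the 2001:db8::/32 documentation range) are hypothetical.]

```python
import ipaddress

# Solicited-node multicast prefix ff02::1:ff00:0/104 (RFC 4291, assumed here).
SOLICITED_NODE_PREFIX = int(ipaddress.IPv6Address("ff02::1:ff00:0"))


def solicited_node_multicast(addr: str) -> ipaddress.IPv6Address:
    """Return the solicited-node multicast group for an IPv6 unicast address.

    Only the low-order 24 bits of the unicast address are kept; they are
    appended to the fixed 104-bit prefix.
    """
    low24 = int(ipaddress.IPv6Address(addr)) & 0xFFFFFF
    return ipaddress.IPv6Address(SOLICITED_NODE_PREFIX | low24)


if __name__ == "__main__":
    # Hypothetical example addresses.
    print(solicited_node_multicast("2001:db8::1234:5678"))  # ff02::1:ff34:5678
    print(solicited_node_multicast("2001:db8::1"))          # ff02::1:ff00:1
```

[editor: Two addresses land in the same group only if their low-order 24 bits match, so a Neighbor Solicitation is normally delivered to a small set of hosts rather than to every host in the broadcast domain, provided the hosts and switches actually filter multicast.]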
Unknown: It's unclear what scale we're talking about. Scale: each cluster, 10000 physical hosts, 8 cores, 4 VMs per core. The only place where I've seen ARP as a problem wasn't a virtualized domain. Even without going to that scale, even without VM migration, people have seen problems. Some of them are bugs; even if they are fixed, there is a learning problem on switches. You'd be surprised about mobility granularity. The fact that learning is by flooding creates problems with VM movement. Everybody says ND scales, but I haven't seen anyone who actually tried it. Also, I am not sure we need a protocol change, but we need a solution. I don't understand whether this is in scope or out of scope for the IETF.

Ross Callon: We need to figure out the problem, and quickly; there is talk of starting a datacenter OPS WG in the IETF. That's a very urgent effort. It is not clear whether we need a new protocol. It also seems there are different ways of virtualizing w.r.t. the network. There's an issue. It is not obvious we need a new WG, but it is obvious we need to look at the issue. This talk has made me understand better the urgency for a group to look at the operation of datacenters. Not sure we need a WG to look at ARP. It's a high-priority effort.

George ? (Cisco): Sitting as an uninformed observer, I have seen a lot of confusion as to what problem needs to be solved. Jumping at a solution while the problem is still not clear isn't good.

Claudio DeSanti: There is confusion of problems; we're working on perception, e.g. that flooding is bad. But actually flooding is very normal; that's how you learn where something is located in the datacenter. If we decide to do some work, we need to clarify the problems.

==========
Question for Hums:
#1 - Working group should be chartered now.
#2 - Work is interesting, but further definition of WG needs more work.
#3 - Work should be abandoned.

Order of the hums: 1. #2, 2. #1, and 3. #3.
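
[editor: The scale figures quoted by the unidentified speaker above (10000 physical hosts per cluster, 8 cores, 4 VMs per core) can be multiplied out as below. This is a back-of-the-envelope sketch, not something presented at the BoF. It assumes that "8 cores" means 8 cores per physical host; the function name is hypothetical.]

```python
def vms_per_cluster(physical_hosts: int, cores_per_host: int, vms_per_core: int) -> int:
    """Number of VMs in one cluster if every core is fully loaded with VMs."""
    return physical_hosts * cores_per_host * vms_per_core


if __name__ == "__main__":
    # Figures as quoted in the minutes; "per host" is an editorial assumption.
    total = vms_per_cluster(physical_hosts=10_000, cores_per_host=8, vms_per_core=4)
    print(total)  # 320000
```

[editor: Under that assumption each physical host carries 32 VMs, so a single cluster holds 320000 potential ARP or ND endpoints.]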