TRILL Working Group Meeting Minutes

Monday, March 28, 2011, 13:00-15:00, Vienna/Madrid Room, Prague Hilton

Chairs: Erik Nordmark (Cisco), Donald Eastlake (Huawei)

Notes by Jon Hudson and Radia Perlman, edited by the Chairs.

Blue Sheets distributed. Attention drawn to "Note Well" notice. Jon Hudson (Brocade) volunteered to be scribe. No one appears to be on Jabber and no one volunteered to be Jabber scribe.

A new Charter is in place and online. See the TRILL WG website for details: http://datatracker.ietf.org/wg/trill/charter/

Reviewed major items called out in the Charter.

David Allen (Ericsson): About the agenda, I understand there was a liaison received recently from the IEEE. Will that be discussed?

Erik Nordmark (Cisco, TRILL Co-Chair): That liaison was actually to the IESG.

Ralph Droms (Cisco, TRILL WG AD): Right. The IESG sent a liaison to IEEE and they responded to the IESG. The IESG is considering this response. Liaison statements are all publicly available online if anyone wants to look at them. Look in the bottom left corner of the IETF home page.

Reviewed milestones in Charter: completed, overdue, and future.

TRILL document status was reviewed: One RFC, RFC 5556, a problem and applicability statement, was published some time ago. The TRILL base protocol specification was approved as a standard 15 March 2010, a little over a year ago. To implement TRILL you also need the IS-IS code points and data formats. Those are in two drafts which have been approved recently as standards and all three documents are in the RFC Editor's queue. Other documents in process include the adjacency document, a MIB draft, a draft on options / header extension, and three OAM drafts that will be covered in a presentation below. There are also two unposted drafts, one an update to the adjacency document and one on Appointed Forwarders and inhibition.

IEEE Code points: Ethertypes have been allocated for TRILL and L2-IS-IS and a block of multicast addresses for TRILL use.
Also an NLPID (Network Layer Protocol Identifier) has been assigned for TRILL.

There was a TRILL plug fest in August 2010 at the University of New Hampshire Interoperability Laboratory. UNH IOL is working on scheduling another one.

=============================
RBridge MIB Update, Anil Rijhsinghani (HP)
draft-ietf-trill-rbridge-mib-02.txt

The author was unable to make it to the meeting. Donald presented his slides. There were no questions or comments.

Erik Nordmark: This document should be ready for Last Call.

Donald Eastlake: Anyone object to a last call for the MIB document?

There were no objections in the room.

Erik: We'll confirm that on the mailing list.

=============================
OAM and BFD Support for TRILL, Vishwas Manral (IP Infusion), David Bond (UNH IOL), Donald Eastlake 3rd (Huawei)
draft-eastlake-trill-rbridge-channel-00.txt
draft-bond-trill-rbridge-oam-01.txt
draft-manral-trill-bfd-encaps-01.txt

First Presentation: Mostly a condensation of slides shown in Beijing on OAM.

Unknown: Does BFD already have an Ethertype?

Donald: No, as far as I have been able to determine, there is no BFD Ethertype. If you are sending it through the proposed RBridge Channel, you don't need one. BFD would be identified by the protocol code point in the RBridge Channel Header.

Second Presentation: TRILL OAM, created by David Bond, presented by Donald.

Dinesh Dutt (Cisco): I have a question about this ICMP facility draft. This is not about the Channel draft or the BFD draft, just the other draft. Earlier you said you can't use IP because RBridges are not required to have an IP address and you can't use IEEE OAM for other reasons. I buy the argument against IEEE OAM but, practically, every RBridge will actually have an IP address, whether you like it or not, to manage it, ...

Donald: In the real world, I agree.
Dinesh: If every RBridge has an IP address, then why can't I just construct some IP UDP ICMP ping frame and send it with the TRILL TTL set to 1 and then keep increasing the TRILL TTL, getting an ICMP back until it gets through?

Donald: You can.

Dinesh: Then why is all this OAM stuff needed?

Donald: You need it for other things like error reporting, and your way means you are forever requiring every RBridge to have IP.

Dinesh: Sure, but the question I would like to ask the audience is, is that so bad? Practically, every RBridge will have an IP address for management...

Erik Nordmark: Are you saying you increment the IP TTL?

Donald: No, he is saying you set the IP TTL to some large number but you increment the TRILL TTL each time. The entire campus path is just a single IP hop.

Dinesh: You just set the TRILL TTL, not the IP TTL, because RBridges do not change the IP TTL.

Erik: But you still need the error reports coming back.

Dinesh: You just use the ICMPs coming back.

Erik: But now you are overloading the ICMP error to be about TRILL TTL exhausted as well as about IP TTL exhausted.

Donald: The other question I have is how do you then "ping" an RBridge?

Dinesh: There are two ways but I hear people in the audience saying "no, no", that you do not want to require every RBridge to have an IP address. The question I have is, do we need to define anything further if you can use this ICMP traceroute? I assume it is something that just works.

Donald: I agree with that, but you also seem to agree that certain kinds of error reporting are needed that are not provided at the IP level. So I see the question as being whether to take out the traceroute provision in this draft.

Dinesh: No, I don't want to take anything out yet. I am just trying to understand. Can I do it? If not, why not? And things like that. Has someone looked at this and said you can't do it? I would like to know.
Donald: I think you can do exactly what you are proposing but I would still prefer to leave the traceroute facility in the draft so it is possible to have an RBridge without an IP address.

Dinesh: I think we should take this up on the mailing list. But on another thing, you say there are other errors that need to be reported. Can you give an example?

Donald: Well, the first one that occurs to me is "illegal egress nickname". An RBridge receives a TRILL Data frame but the egress nickname is unknown or reserved.

Dinesh: OK. So like if I sent a UDP packet to a non-existent port, I'd get an ICMP back. This is like that for RBridges.

Donald: Yes. And there are other possible errors like "administratively restricted" which would mean that your frame looks fine but I'm not going to deliver it anyway.

Puneet Agarwal (Broadcom): Just responding to Dinesh, TRILL is a separate layer and, if it wants to do OAM, it needs to use its own name space. And it has its own name space and it is a clear layer violation to mandate an IP address for TRILL to work. As a practical matter, we could choose to do that. But as a separate layer, it should have its own name space, which it already has, and should have all the capabilities it needs at its layer.

Donald: OK, however we must move on due to limited time.

Third Presentation: Vishwas Manral (IP Infusion), who could not attend, so presented by Donald.

...BFD provides low overhead continuity testing messages. Because of the lower overhead, they can be sent more frequently...

After all three presentations:

Donald: Would like to change name of the OAM Channel to "RBridge Channel". Actually, it isn't yet a working group draft so that could just be done. But I'd like people to agree. And I also think that this set of three drafts should be made working group drafts. They all need some work and polishing, but I think they are a good starting point for TRILL OAM.

Erik: We talked about these drafts in Beijing.
There have been some revisions and they do need some more work but they seem like a reasonable starting point. Anyone opposed to making these WG drafts? [no one] Anyone in favor of making these WG drafts? [several people] Any other questions or comments?

Dinesh Dutt: Is there interest, in addition to these three drafts, in having a draft on using the IP address mechanism? It would not say to not use the other drafts, just provide an alternative. Does anyone object to adding such a draft other than that someone has to write it? [no one] Does anyone support adding such a draft? [no one]

Donald: Anyone is welcome to write such a draft. I think that once such a draft exists, people could more easily judge whether they wanted it...

Erik: The draft would have to work out a few details like when you send an IP TTL expired ICMP and when you send a TRILL TTL expired ICMP.

Dinesh: Sure. We can be very pure and wait until things are perfect or we can go ahead with what we have and solve the 80% case. Today if I want to do ping and traceroute, I would use the IP method so I don't have to wait for these documents to be finished and I don't have to say TRILL has no OAM. Thanks.

=============================
Server Assisted TRILL Edge, Linda Dunbar (Huawei)
draft-dunbar-trill-server-assisted-edge-00.txt

Linda Dunbar (Huawei): Little mistake: title should be Directory Assisted Edge, not server assisted edge. Purpose of presentation is to see if others are interested in this approach. Data Center network is simple and regular compared with general IP network. So maybe we can use different techniques. ... Virtualization increases number of servers to route to. ... Subnets span multiple shelves ...

Linda: Where Virtual Machines are placed is centrally controlled. ... Can use that to do a better job than learning addresses from observing data going through ports. ...
Dinesh Dutt: There are lots of edge devices and if there is a MAC lookup miss, it will result in flooding that will cause excessive address learning. Is that the problem you are talking about?

Linda: It is more than that but go on with your question.

Dinesh: If that is the problem, there are lots of ways to solve it without requiring any work from a protocol or any standardization. So right now Cisco boxes solve this problem. You can have a Cisco RBridge and a Juniper RBridge -- wait, Juniper does not do TRILL -- you can have a Cisco RBridge and a Huawei RBridge connected together and you can optimize the MAC learning on one without requiring any changes in the other. If something can be solved without a standards working group being involved, that is simpler. And I'm afraid TRILL is heading down a path of increasingly complicated features. You can solve MAC learning without protocol changes.

Linda: Actually, it is more than that. You are also avoiding flooding. If you don't know where a MAC address is, you flood the frame and the return frame tells you where it is. Then you can send future frames there directly. But there is this directory that already knows where it is, so you should be able to use the directory.

Donald Eastlake: I believe this idea does not require any changes to the current TRILL spec. It involves turning off some address learning, which is already provided for in TRILL. And it involves some box before the ingress RBridge pre-encapsulating the frame to the egress it looked up, which is also provided for as a standard option.

Erik Nordmark: TRILL has thought about how to do this more efficiently and that is why we have the optional ESADI protocol in TRILL. Also, we have a Charter item to reduce broadcast and multicast including ARP/ND and earlier documents on this used ESADI for that purpose. That is a different idea as it does not assume centralized knowledge.
But I take your point that if centralized knowledge does exist, as in a VM controller, it should be simpler to leverage that. But it is also more specialized in that it assumes the existence of central knowledge.

Brumada/Murari? (Microsoft): In response to Dinesh, I agree that there may not be work needed in TRILL but TRILL should be careful not to preclude use of this idea. Existing VM Managers have some protocol already that they use to communicate with hypervisors, etc.

Ali Sajassi (Cisco): The flooding of ARPs is a big problem and there is a big incentive to improve it. Have you studied how much flooding there is?

=============================
RBridges: Multilevel TRILL, Radia Perlman (Intel Labs)
draft-perlman-trill-rbridge-multilevel-01.txt

Radia Perlman (Intel): This presentation is not "here is a design" but rather thinking through the key issues and talking about alternatives. There are two general approaches: unique nicknames and aggregated nicknames. ... What is limiting scaling? ... multi-level IS-IS ... area addresses ... backward compatibility ... multiple border RBridges -- one must be in charge ...

Dino Farinacci (Cisco): But you can have multiple entry points for data? Is it multi-destination or active/backup?

Radia: Absolutely. Unicast can use all the border RBridges. One idea of aggregation is that the more you aggregate, the less optimal your paths are... You can have the border RBridges advertise their cost to other areas...

Dino: Why do you need a new method? Why not use the attach [attached to Level 2] bit that ISIS has?

Radia: You could do that but then you can't optimize routes or know what areas exist or don't exist. In CLNP it was clear what addresses were not in your area.

Dino: I was just asking, and this goes back to 1987, whether you were using the attach bit or default route like CLNP or injecting specific routes like we did in IPv4, ... are you taking one or the other or combining them? I'm not clear...
Radia: I'm not being clear on purpose. I want people to understand the trade-offs. I don't want to say "This is the solution..." The more scalable you make it and the less information you inject, the less optimal the routes are. Personally, I'd probably favor more scalable ... But I'm mostly just describing the alternatives...

Radia: On trees: If you make a tree in each area and in Level 2 and exactly one RBridge connects each area tree with the Level 2 tree, then you will have one big tree.

Dinesh Dutt: Does there need to be a single RBridge for all trees?

Radia: I'll talk about that. In single level TRILL we wanted to load split multi-destination traffic so you can have multiple trees, say three, and the ingress RBridge chooses which one a multi-destination frame is distributed on. And you can prune the trees...

Radia: Say you have 100 areas. If you calculate three trees for the whole network and they are all rooted outside an area, then, if there is only one border RBridge for that area, they will all be the same tree in that area. Even if there are multiple border RBridges, you have less flexibility.

Radia: More generally, you can have each area calculate the trees it wants. Not constrained to border RBridges as roots. Then the border RBridge that is transitioning the frames maps between Level 1 and Level 2 trees. An interesting question is how do you decide which RBridge will transition a frame?

Radia: RPF state is also an interesting question. At an RBridge in the middle of an area you need to see if the port that a multi-destination frame is arriving on corresponds to its ingress and the tree it is on. So, inside an area you would need to know which border RBridge transitioned a frame from Level 2 into that area. The "ingress" is actually way off inside some other area.

Dino: Remember multicast OSPF?
There was concern about reverse path versus forward path and unicast following the same path as multicast, considering that metrics are only available in one direction?

Radia: Sorry, but I'm a bit panicked about time... But I can tell from your interest and enthusiasm that you would like to help design this.

Dino: That's an incorrect assumption.

Radia: With RPF state, the aggregated nickname thing is very nice because the aggregated nickname aggregates the RPF state also. There is less state, including less forwarding state, using aggregated nicknames although you could change it to ranges, perhaps.

Radia: OK, say there are multiple border RBridges. How do you pick which forwards a multi-destination frame? I sort of favor just one. It's simpler. If multiple border RBridges can transition a multi-destination frame into an area, the RBridges inside that area need to know which one did it for the RPF check. Could be per VLAN, per tree, or whatever. This adds a bunch of complexity and I'm not certain it would hurt much to have only one.

Erik Nordmark: Go back to slide 3, with the list of 6 things. This presentation is about solving the first four. Is there interest in solving these? [About 6-12 people showed hands in favor.]

Dinesh: I think the fourth, running out of nicknames, should be separated and considered separately from the first three.

Erik: Is there interest in pursuing Multilevel TRILL or is it an academic argument?

Dino: I'm afraid this may be academic because if you are building a layer 2 network that big, why not use IP?

Radia: The trouble with that is that if these are all IP routers, then when you move your IP address changes. That's why people like a larger layer 2 thing.

Dino: There are lots of people working on various solutions to that at layer 3.

Donald Eastlake: I personally own IPR that is in this draft. It is my intent to license it on a royalty free basis for use with TRILL. I will file a disclosure with the IETF in a couple of weeks.
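
The RPF check discussed in the Multilevel TRILL presentation above can be illustrated with a small sketch. It is not taken from the slides or the draft: the table layout, the nickname values, the port name, and the names `RpfTable`, `allow`, and `accept` are all hypothetical, and it assumes a border RBridge rewrites the ingress nickname to the source area's aggregated nickname. Under those assumptions it shows why aggregated nicknames shrink RPF state: one entry per (tree, area nickname) replaces one entry per (tree, remote ingress).

```python
from dataclasses import dataclass, field


@dataclass
class RpfTable:
    """Hypothetical RPF state: (tree root, ingress nickname) -> expected port."""
    expected_port: dict = field(default_factory=dict)

    def allow(self, tree: int, ingress: int, port: str) -> None:
        # Record the port on which frames for this tree/ingress should arrive.
        self.expected_port[(tree, ingress)] = port

    def accept(self, tree: int, ingress: int, port: str) -> bool:
        # Accept a multi-destination frame only if it arrived on the port
        # expected for its distribution tree and ingress nickname.
        return self.expected_port.get((tree, ingress)) == port


TREE = 0x0001                                  # made-up tree root nickname
REMOTE_INGRESSES = [0x1001, 0x1002, 0x1003]    # made-up RBridges in another area
AREA_NICKNAME = 0x1000                         # made-up aggregated nickname

# Unique nicknames: one RPF entry per remote ingress RBridge.
unique = RpfTable()
for nick in REMOTE_INGRESSES:
    unique.allow(TREE, nick, "port-to-border")

# Aggregated nicknames: one RPF entry covers the whole remote area.
aggregated = RpfTable()
aggregated.allow(TREE, AREA_NICKNAME, "port-to-border")
```

In this toy setup the unique-nickname table grows with the number of remote RBridges, while the aggregated table grows only with the number of areas.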
=============================
VRRP for RBridges, Hongjun Zhai (ZTE), Fangwei Hu (ZTE)
draft-hu-trill-rbridge-vrrp-00.txt

Fangwei Hu (ZTE): Extend VRRP to improve the speed of recovery when an RBridge fails. ...

Vashkin?: Why is VRRP needed? Convergence time is much less than detection time. Could compute a backup path in IS-IS and just switch to that instead of having an additional protocol.

Fangwei: I believe this will offer faster recovery time in the event of an RBridge failure.

=============================
TRILL Header Extensions, Donald Eastlake 3rd
draft-ietf-trill-rbridge-options-04.txt

Donald: Header extensions. Reviewed what was agreed to in Beijing.

Donald: After the Beijing meeting it was agreed on the mailing list to rename "bit encoded options" to "header extension flags".

Donald: There are basically two ways to extend the header. One is to add more or less fixed fields to the header; the other is more tag-like, with tags later in the frame.

Donald: Request to change "options" to "TRILL header extensions".

Dinesh: I prefer "options" to "extensions".

Donald: I'm proposing "header extensions", not just "extensions".

Dinesh: I guess that's OK. But why do you want to do this? Every other protocol calls them options.

Radia: It's just words. But I think TLVs are usually called options whereas things that are more of a fixed field added on are typically called extensions. So if you have both, it might be better to use different words.

Dino: Do they have to be implemented?

Donald: No.

Dino: Then they are optional.

Donald: Yes, I agree that the extensions are optional. But I think this is editorial. I just brought it up to be safe since this is a WG document. Unless there are more objections, I plan to do it. [There were no additional objections.]

Erik: Just a note to people: There will be a presentation on the status of the adjacency document at the IS-IS WG meeting right after this meeting.
They requested an update so if you are interested, you can go to IS-IS and hear that.

=============================
VLAN Load Balancing, Mingui Zhang (Huawei)
draft-zhang-trill-vlan-assign-00.txt

Mingui Zhang (Huawei): Adaptive VLAN assignment for Data Center operators. Data center is highly dynamic. Multi-access is an important aspect of TRILL. ... Current TRILL protocol does not describe how to pick appointed forwarders for a link. Imbalance is possible.

Mingui: Balance in number of VLANs does not guarantee balance in traffic. Require appointed forwarders to report the number of MAC addresses in a VLAN and amount of traffic.

Mingui: We will refine this draft and specify it further.

Donald: Any questions or comments? [none]

Erik: We seem to have ten minutes left. Why don't you give the adjacency document status presentation?

=============================
TRILL Adjacency Document Status, Donald Eastlake 3rd (Huawei)
draft-ietf-trill-adj-05.txt

Donald: OK, this is a limited scope draft. ... The document has been last called on both the TRILL and ISIS mailing lists and all the last call comment resolution messages appeared on both mailing lists. The only change between the -04 and -05 drafts was to remove a part that was out of scope. -05 has not yet been posted but could be this evening. The draft fully documents how to handle duplicate MAC address ports on a link. The method used could be applied at Layer 3.

Erik: Any questions? [none]

Erik: We seem to have ended a few minutes early. Will that cause us to get less time next meeting?

Donald: I don't think so. Thanks for attending. See you on the mailing list.

[Meeting Ends]