Layer 2 Virtual Private Networks (l2vpn)

Chairs: Vach Kompella & Shane Amante

One Session:
IETF 74: San Francisco, CA, USA
TUESDAY, March 24, 2009, 0900-1130

Agenda

1) Administrivia
    WG Status Web Page: http://tools.ietf.org/wg/l2vpn/
    Mailing List: http://www.ietf.org/mail-archive/web/l2vpn/index.html
    Scribes
    Blue Sheets
    Agenda Bashing

    TRILL update: WG last call, welcome feedback from L2VPN people

2) WG Status and Update
    Chairs
    10 mins

3) VPLS MIB: CANCELLED
    Rohit Mediratta (rohit.mediratta@alcatel-lucent.com)
    draft-ietf-l2vpn-vpls-mib-02
    10 mins

4) IPv6 Updates to ARP-MED
    Andrew Dolganow (andrew.dolganow@alcatel-lucent.com)
    draft-ietf-l2vpn-arp-mediation-10
    20 mins

    see slides for presentation
    snooping mechanisms at PEs to use ARP-like process over PW (for Ethernet and non-Ethernet CEs)
    different variants (ND, IND)
    current open issue: dual stack support
    document is ready for WG LC

    Shane: how many read the draft? ~6 persons
    more folks to look at the draft
    wish to issue LC within next month
    note Andrew's comment that changes on IPv6 will be ported to the IPLS draft
    no question raised

5(a) MAC Flush Loop Detection in VPLS
    Pranjal Dutta (pdutta@alcatel-lucent.com)
    draft-pkwok-l2vpn-vpls-macflush-ld-00
    15 mins

    see slides for presentation
    procedures to avoid MAC flush loop
    uses Path Vector TLV defined in RFC5036 to detect loop (PE ID present in the list)

    questions:
    Ali: if the split horizon mechanism failed, you have a bigger problem than just MAC flush loop, e.g. generation of multicast adv. storm not forwarding to the end point that is part of the split horizon
    Pranjal: agree, but here the case is, for instance, if you have a misconfiguration then the failure repeats; this mechanism is made to avoid this.
    Ali: VPLS misconfiguration is not enough to motivate this work
    Florin: the point here is you can have a loop due to misconfiguration, and you want to make sure that at the CP level you do not expand the problem. This may be one way to see the problem.
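
The loop check summarized in 5(a) (PE ID present in the path vector list) can be sketched as follows. This is a minimal illustration only, not the procedure from draft-pkwok-l2vpn-vpls-macflush-ld-00: the MacFlushMessage class, the process_mac_flush function, and the string PE IDs are hypothetical names, and the path vector is modelled as a plain tuple rather than an encoded RFC 5036 Path Vector TLV. The sketch assumes each PE appends its own ID before propagating a MAC flush and treats a message that already lists its ID as looped.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class MacFlushMessage:
    """Hypothetical stand-in for a MAC flush message carrying a path vector."""
    vpls_id: int
    path_vector: Tuple[str, ...] = ()  # IDs of the PEs already traversed


def process_mac_flush(local_pe_id: str,
                      msg: MacFlushMessage) -> Optional[MacFlushMessage]:
    """Return the message to propagate, or None if a loop is detected.

    Assumption: a loop exists when the local PE ID is already present
    in the received path vector.
    """
    if local_pe_id in msg.path_vector:
        # This PE has seen the flush before: stop propagating it.
        return None
    # Record this PE in the path vector before passing the message on.
    return MacFlushMessage(msg.vpls_id, msg.path_vector + (local_pe_id,))


if __name__ == "__main__":
    # A misconfigured ring PE1 -> PE2 -> PE3 -> PE1.
    msg = MacFlushMessage(vpls_id=100)
    for pe in ("PE1", "PE2", "PE3"):
        msg = process_mac_flush(pe, msg)
    print(msg.path_vector)
    print(process_mac_flush("PE1", msg))  # None: PE1 detects the loop
```

The design choice in this sketch is that the check happens purely in the control plane, on receipt of the message, which matches Florin's point that the goal is to keep a data-plane loop from being amplified at the CP level.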

5(b) LDP Extensions for Optimized MAC Address Withdrawal in H-VPLS
    Pranjal Dutta (pdutta@alcatel-lucent.com)
    draft-pdutta-l2vpn-vpls-ldp-mac-opt-04
    2 mins

    Shane asks the author to mail to the list the update on this draft.

6) Extensions to VPLS PE model for Provider Backbone Bridging
    Florin Balus (florin.balus@alcatel-lucent.com)
    draft-balus-sajassi-l2vpn-pbb-vpls-01
    15 mins

    see slides for presentation
    define extensions to VPLS PE model to accommodate PBB-VPLS integration
    clarify the interaction between IEEE 802.1ah and VPLS specs
    have a clear demarcation between 802.1ah and VPLS modules in order to:
      a) describe only IETF related procedures; and,
      b) avoid describing IEEE related procedures
    feedback from the group asked
    authors believe draft is ready for WG document adoption

    Questions:
    Nurit: bridge device can be B-BEB?
    Florin: yes, can be any type of device
    Ali: provide a comment: this doc. is the general draft that describes the PE model, scenarios are described in another draft
    ???: you try to address PBB-VPLS scalability in the backbone, why not use different forwarders instead of single VPLS?
    Florin: to have H-VPLS forwarder you need to have hierarchy available in the DP, what we have done is start from available standards in IEEE, and propose method to map it to IETF VPLS model
    ???: you try to provide scalability of the MAC address table
    Florin: we are doing two things: 1) MAC table scalability; and, 2) service instance scalability
    Florin: to provide MAC scalability you can encapsulate client MAC's
    Florin: 2nd part is to provide multiple customer instance in the Backbone VPLS
    ???: Yes, but using multiple forwarders, then the service delimitation is at the PE
    Ali: From the PE viewpoint, the number of service instances is the same. The number of PW's is what really matters.
    Ali: You can have a full mesh of PW's among PE's. This approach provides a broadcast domain between PE's.
    ???: This is transparent to the PE's: PW are not visible.
    Ali: Yes, but PW's are managed entities, so the number matters.
    ???: similar for the I-component (?)
    Ali, Florin: No.
    Ali: You have a single AC per end point, not per service

    Shane: how many read the draft? ~20 persons
    among them, how many are in favor of WG doc? about the same number
    send mail to the WG list and take consensus from there

7) VPLS Interoperability with Provider Backbone Bridges
    Ali Sajassi (sajassi@cisco.com)
    draft-sajassi-l2vpn-vpls-pbb-interop-04
    15 mins

    feedback from the group asked
    authors believe draft is ready for WG document adoption
    no questions raised.

    Shane: how many read the draft? ~6 persons
    among them, how many are in favor of WG doc? about the same number
    and how many are opposed to become a WG doc? no one
    take it to the list for approval

8) Multi-homing in BGP-based Virtual Private LAN Service
    Bhupesh Kothari (bhupesh@juniper.net)
    draft-kompella-l2vpn-vpls-multihoming-02
    20 mins

    - Want to address questions raised at last meeting
    - Q: Is there a traffic impact for sites not affected by access link failure?
      A: NO. Irrespective of whether all sites are multi-homed or some are single homed and others multi-homed, there is NO traffic disruption for sites that are unaffected by the access link failure
    - Q: Is there a need to upgrade all PEs in the network that support multi-homing to avoid Layer 2 loop?
      A: NO.
    - Q: Can path selection computation be localized so that only multi-homed PE's run the algorithm?
      A: NO. Failures cannot be localized unless assumptions about traffic type and providers requirements are made. The key question is what are we trying to solve by localizing path selection?

    Questions:
    Ali: Would like to point out what could be the impact of the PW failure in this approach. In normal VPLS, when AC fails, there is no switch-over from active to stand-by PW. How much does this affect the packet loss in case of switch-over of PW?
    Bhupesh: this is implementation specific.
    Ali: I do not agree.
    In normal VPLS, if you have 10 AC's and one fails, the PW is still up, no switch-over, no loss. In this approach (draft presented), if an AC fails, you need to switch from the active to the standby PW, so this generates packet loss -- it is not hitless.
    Bhupesh: OK, but depends on how you implement it. Standby label is pre-programmed in data path.
    Yakov: Your point is very implementation dependent, it may be that in your implementation it is not hitless, but in others it is hitless.
    Ali: I disagree. It is not completely hitless -- there will be some impact, however small.
    Kireeti: Normal/regular VPLS, what are you talking about?
    Ali: VPLS without dual homing.
    Florin: Comment on the LDP-based VPLS slide. To solve the issue presented, you can either do multicast snooping, or you set the second PW as stand-by, or Admin status down.
    Bhupesh: OK, but what you are proposing is to do path selection, we are using the path selection to actually select the "active" PW, this is the same process.
    Florin: The scenario you are presenting is a minor one, most of the time, you do not have that kind of scenario, so this is not a big issue. You do this unnecessarily for most of the use cases if you use path selection. An alternative is to use local path selection. There is no need to generalize the problem to all use cases.
    Shane: Let me try to summarize the comments raised. You are talking about making a local decision for selection of the active PW and then informing all remote sites vs. all remote sites have to come to the same conclusion themselves.
    Wim: You do not have to do the path selection algorithm at the remote PE, just simply select locally the active PW ...
    Shane: Right, compute the active PW at the two sites connected to the multihomed CE and then inform the remote site which PW it should use
    Florin: You do not have different service labels. Instead, the VPLS infrastructure is stable -- PW's are provisioned and not changing.
    Bhupesh: Is computation what you are worried about? Then this approach will work just fine. If you send BGP UPDATE's, these messages are flooded in the domain, you then enforce convergence. You achieve optimal situation. Why are you worried about BGP messages?
    Wim: Not an issue, only different approach. In this case the PW service labels converge, whereas in the other approach, the VPLS infrastructure is stable and you just inform the remote PE's what PW is used for forwarding.
    Bhupesh: With BGP path selection, you only need to run it once at the start.
    Wim: OK, but if you look at the full solution, you have to resolve that in a way that you minimize the impact on the network (VPLS PW infrastructure). There is an impact on convergence.
    Bhupesh: This is the general problem we are trying to solve.
    Wim: Indeed, this is not a debate on BGP or LDP-based approach. The main reason is because you can do it locally, you minimize the impact on the rest of the PE's in the network.
    Kireeti: I would like to take the discussion one level higher. You can use BGP Path Selection. If the issue is computation, look at your Control Plane CPU, because this really is not an issue. Both solutions are pretty much similar, but the trade-off on extra computation on PE's is completely irrelevant. The solution proposes to distribute the computation among all PE's.
    Ali: What I would like to see in the draft is discussion on the scalability, VE-ID assignment, number of PW's, and comparison of those numbers with "normal" VPLS (i.e., without dual-homing). Scalability is a function of number of CE's rather than VE's. Changing the label is not hitless. Also, would like to see this discussed in the draft.
    Bhupesh: Will add text that discusses that.
    Florin: We should try as much as possible to merge the two solutions in one for dual-homing, whether BGP or LDP underneath is not the question.
    Bhupesh: Agreed. That's why we are re-using what is existing in BGP-VPLS.
    There is a big overlap with this work.
    Florin: Is the issue to be only compliant to BGP-VPLS? Why do you need to involve all other PE's in the path selection of the active PW?
    Bhupesh: BGP messages go to all PE's for Auto-Discovery, but this signaling is also sufficient to do path selection of the Active PW as well. You don't need another signaling protocol to determine the active forwarding path -- all the machinery is already available, so let's just use it.
    Florin: Maybe we can discuss this off-line. My question is why have you chosen to use BGP path selection for doing both functions? We should take a step back and look at why we can't use local path-selection.
    Kireeti: I disagree, if there is a "hit" because of (intrinsic) computation problem, then OK we need to solve it, but if it is only CPU power issue, then you have a bigger problem.
    Shane: Time out. The larger issue is: can we solve the multi-homing problem with one solution, or do we need to design something that is specific to VPLS-BGP?
    Kireeti: We should look at how we should solve this independently for both VPLS-LDP & VPLS-BGP, then we look at if it's possible to merge them. Localizing path selection is a red herring. Trying to figure out how to do fail-over without losing traffic and not duplicating traffic in the network on PW's that are not active are much more important questions that we should be looking at. If the two solutions to this problem look similar, then we can look at merging them -- I'm not opposed to doing that. We're optimizing for the hit on fail-over, not the hit on computation.
    Bhupesh: For BGP A-D, you have to flood these messages to all other PE's anyway and once we've done that we can re-use these same NLRI for path-selection, as well.
    Wim: There are other approaches, which we'll explain in next preso.
    Ali: 3 approaches to solve the problem. We can list pros/cons in each of them and discuss on the mailing list.
    Want to see discussion on scalability issues and hitless/data interruption impacts. BGP path computation should be secondary to those issues.
    ???, from Verizon: Can't we just put it in the CE?
    Wim: These solutions are focused on PE's, because there exists no signaling from PE to CE ...
    Shane: take it off-line, out of time

    (last slide of the presentation)
    Bhupesh: We have addressed all issues raised since Dublin meeting and want to propose adopting this as a WG doc.
    Shane: Given the amount of discussion here today and some of the fundamental questions that are out there, it seems a bit premature for this to become a WG doc. Let's resolve the issues on the mailing list, ask people involved to be responsive, and resolve this before next IETF meeting.

9) BGP based Multi-homing in Virtual Private LAN Service
    Wim Henderickx (wim.henderickx@alcatel-lucent.be)
    draft-henderickx-l2vpn-vpls-multihoming-00
    20 mins

    questions
    Bhupesh: Is new NLRI used for Auto-Discovery?
    Wim: No. Only used to signal multi-homed information.
    Bhupesh: Why? Only have Site ID as addt'l info in the same NLRI?
    Wim: Today, with BGP A-D it's only giving information on the VSI. Wanted to create new NLRI to leave existing BGP A-D "as is" and then have this 'new' NLRI to carry multi-homed info.
    Bhupesh: Second question is about fast convergence. What do you mean by that: Control Plane or Data Plane and what is the mechanism to do that?
    Wim: Not meant to be faster than your proposal, just meant that the target for convergence was around 50 msec or so ...
    Shane: Overall, that number is a goal for any multi-homing proposal, yes?
    Wim: Yes. Goal was to be in the 50-msec range ...
    Bhupesh: I didn't see this in the draft.
    Wim: At the end of the day, that's the goal, but it will probably come down to an implementation detail -- there is nothing in the solution which would prevent us from doing that.
    The main reason we believe this is achievable is because we have a pre-established PW already in place to do that.
    Bhupesh: Does this also assume the 2 PEs are in the same AS?
    Wim: Yes, for now. I'll clarify this in the next rev of the draft.
    Ali: Given that Site ID's are configured locally within an AS, how do you avoid Inter-AS Site ID collision?
    Wim: It is a requirement for both our draft and the VPLS-BGP multi-homing draft that the Site ID's must be globally unique.
    Ali: But, we didn't have this requirement with VPLS-LDP before.
    Wim: Because we didn't use Site ID before ...
    Ali: Right, and Inter-AS VPLS worked just fine ...
    Wim: But, there was no multi-homing solution for VPLS-LDP ...
    Shane: Don't you have a RD in the NLRI that could be used to make them unique?
    Wim: No, because that RD is used to make the Site ID unique on a per-PE basis, but that info is given to the Designated Forwarder selection and it is used ...
    Ali: You can have multiple collisions within the same VPLS instance and within the same RD.
    Wim: Right now, this is not addressed. We could do that.
    Ali: There are known issues with VPLS-BGP and I don't want to bring those into VPLS-LDP.
    Ali: Another question is transient loop. Operations for PW over the core is independent from the AC's. You could get into situation when two AC's go up simultaneously and you end up with a transient loop.
    Wim: That's one of the things I said we'd address in the next revision: startup scenarios, because we've seen cases for transient loops.
    Ali: In other approach, AC's are bound to PW, therefore when AC goes down PW goes down and then you bring up a new PW, so that problem doesn't exist there.
    Ali: MAC flush, if you are using BGP for MAC flush, why not use it also to distribute the labels?
    Wim: Why do we need it to distribute labels?
    Ali: Because that is the only thing left ...
    Ali: Scalability.
    The number of BGP-VPLS routes is going to increase drastically, because it's no longer a function of the # of VPLS VSI's, rather it's a function of the number of CE's. That's the big thing to pay attention to in both approaches.
    Wim: If you look at how BGP is being used for IPVPN, that's not that big of a concern in my opinion.
    Ali: We really need to get into the numbers. It could be a few orders of magnitude more VPLS BGP routes than what we have with the current non dual-homed VPLS routes.
    Shane: But, that's the price to pay for multi-homing ...
    Ali: I disagree. The proposal to avoid all the extra routing info will be discussed in PWE3.
    Rahul Aggarwal: What are the procedures in PE3 and PE4 when they receive the new multi-homed NLRI?
    Wim: They see that the new NLRI is not locally configured, so they simply ignore it.

10) Meeting Closes