BFD Minutes
IETF 102 - Montreal
Wednesday July 18th, 15:20-17:50
Chairs: Jeffrey Haas, Reshad Rahman

Meetecho: https://play.conf.meetecho.com/Playout/?session=IETF102-BFD-20180718-1520
          https://www.youtube.com/watch?v=TsEjkh3s3MY

6m10s in recording
[Note well] 6m25s in recording

[Agenda] 6m35s in recording

[Chairs Slides (Jeff and Reshad)] 7m00s in recording
BFD YANG and BFD multipoint docs in IESG LC.
WG LC for 3 authentication drafts.

BFD authentication drafts: 8m15s
    Greg Mirsky: periodic use of authentication.  Once session is Up, if authentication validation fails, what happens then? If we have one authenticated packet followed by several non-authenticated packets, RFC5880 says the BFD session is still up.  If auth fails, should we do something?
    Mahesh Jethanandani: view optimized authentication as not changing the BFD protocol in any way.  One packet in an Up state is marked for authentication but fails auth. Protocol says it has to miss its detection multiplier worth of packets, only then will the session be marked down.  If one packet fails, but others pass, session stays up.
    Greg Mirsky: I agree per existing specs, that's what is supposed to happen.  But then why do the periodic authentication if there’s no impact on state of session when an authenticated packet fails validation?
    Mahesh Jethanandani: if the auth failed, then why? The periodic authentication was added a bit late in the document's life against MITM attacks.
    Greg Mirsky: Authentication is to protect the BFD session but we don’t take any action when a BFD authenticated packet fails validation. Feel there is a gap in the logic of the proposal. I see 2 options: 1) periodic doesn't really help, let's remove it. 2) If auth fails (periodic mode) in Up state then we should view this as session down.
    Jeff Haas: one possibility is to send detect multiplier number of consecutive authenticated packets, then the session will go down if authentication packets fail validation.
    Greg Mirsky: does that require use of P/F procedure?
    Jeff Haas: P/F is not needed, we send consecutive authenticated packets
    Mahesh Jethanandani: we can make that change.
    Reshad Rahman: is there a need to do the same for state change?
    Mahesh Jethanandani: every state change packet has to be authenticated.
    Jeff Haas: for pedantic reasons, when making the state change we should do the same. On P/F sequence we should do this for the new multiplier if it’s longer. We’ll take this back to the mailing list.
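
The behaviour debated above can be illustrated with a minimal sketch. It is a simplified model, not text from RFC5880 or the authentication drafts: it assumes one packet is expected per interval, that a packet failing validation is discarded and so counts as a missed interval, and that the session goes down after detect-multiplier consecutive misses. The names Packet and session_survives are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    authenticated: bool      # packet carries an authentication section
    auth_valid: bool = True  # validation result; only meaningful if authenticated


def session_survives(packets, detect_mult):
    """Return True if the session stays Up after processing `packets`.

    Assumption: a packet that fails validation is discarded, so it is
    treated the same as a packet that never arrived.
    """
    missed = 0
    for pkt in packets:
        if pkt.authenticated and not pkt.auth_valid:
            missed += 1
            if missed >= detect_mult:
                return False  # detection time expired
        else:
            missed = 0        # any accepted packet restarts detection
    return True


# One failed authenticated packet among accepted packets: session stays Up.
single_failure = [Packet(False), Packet(True, False), Packet(False), Packet(False)]
print(session_survives(single_failure, detect_mult=3))

# Detect-multiplier consecutive authenticated packets all fail: session goes Down.
burst_failure = [Packet(True, False)] * 3
print(session_survives(burst_failure, detect_mult=3))
```

Under these assumptions the first case shows Greg's concern (a single failure has no effect), and the second shows why Jeff's suggestion of sending detect-multiplier consecutive authenticated packets lets a validation failure take the session down without any change to the state machine.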

BFD VXLAN was adopted recently. We were waiting for implementations, we have 2. Cisco has published IPR against this draft (standard Cisco IPR). We will do WG LC shortly on this document and get NVO3 WG to pay attention to this document.

BFD unsolicited. Some implementations support this already.
    Jeff Tantsura: YANG model change needed?
    Jeff Haas: yes. Enke, still looking for this to go through adoption?
    Enke Chen: yes, motivation for route servers. Multiple options there.  This is the simplest one which some people don’t realize is easy to implement. This could be against the base spec or we do informational draft. One application is static route and the other is route server.
    Reshad Rahman: regarding YANG model, no decisions on what is needed - full bis, or YANG model extensions in the draft.
    Enke Chen: we just need a configuration knob.
    Naiming Shen: are there prior examples of YANG for feature extensions to existing protocols?
    Reshad Rahman: don't think so but I can ask and there are enough experts to help you out.
    Enke Chen: is the BFD YANG complete?
    Reshad Rahman: yes it’s in IESG
    Enke Chen: not clear where to make the YANG change since the unsolicited draft is informational
    Jeff Haas: then we need to change status of the document (can’t be informational)

5884-bis. Jeff Haas has contacted the original authors and they’re willing to help.  Minor clarification for LSP-Ping bootstrapping mechanism.

Jabber: Greg Mirsky supports BFD VXLAN WG last call.  Will respond to comments.

Adoption call for BFD mpls demand (draft-mirsky-bfd-mpls-demand)
    Jeff Haas: this is core mechanism for BFD, there seems to be an IPR, hope Greg can clarify?
    Greg Mirsky: I’ll get to IPR disclosure. Multipoint BFD uses demand mode for IP. Proposal for demand mode for MPLS (RFC5884).
    Jeff Haas: I don’t think this addresses my comments about IPR, will take it to mailing list

Jeff Haas: documents to watch elsewhere
Greg Mirsky: p2mp VRRP use case, RTG WG chairs decided that results are inconclusive because of low response. I’d encourage people to read the draft and post their thoughts on RTG WG alias.

[BFD Yang Update (Reshad Rahman)] 30m15s in recording
IANA module issue:
    Martin Vigoureux: we don't have to fix this until we get a clear message that we should.  Trying to get a rule defined IETF-wide before we change this. Need to discuss this with the NETMOD people. Trying to push for no-change here.
    Reshad Rahman: have you discussed with NETMOD chairs?
    Mahesh Jethanandani: if we clear the other discusses we should be able to publish?
    Martin Vigoureux: Discuss has to clear.
    Jeff Haas: When is the discussion likely to happen?
    Martin Vigoureux: you (Jeff/Reshad), me, and Alissa.
    Mahesh Jethanandani: could you clarify how NETMOD is involved?
    Reshad Rahman: this is an IETF-wide consideration.  NETMOD would be the one involved for the name change.
    Martin Vigoureux: wouldn't say "approve" wrt NETMOD, but definitely NETMOD would be involved in the discussion.
    Jeff Haas: may want to update the best practices document (6087-bis)
Jeff Haas: had discussion with Benjamin that putting security considerations for the BFD clients in BFD is the wrong spot.
    

[BFD Multipoint drafts (Greg Mirsky, remote)] 37m15s in recording
Greg Mirsky: multipoint and multipoint-active-tail through IESG last calls. Still some open items being discussed. Connectivity vs. continuity, proposal is to switch to continuity.
Jeff Haas: use of TESLA is not appropriate for BFD multi-point
Greg Mirsky: draft does not say to use TESLA but to look at TESLA to give you an idea what to use.
Martin Vigoureux: regarding Ben Campbell's discuss.  Ben didn't understand why active tail is updating multipoint.  Why they were being progressed at the same time.  To resolve this: active tail doesn't update multipoint, in multipoint add reference to active tail?
Greg Mirsky: we need the UPDATE because active tail does make changes to multipoint.
Martin Vigoureux: do you need the UPDATE tag since you’re progressing both at same time.
Greg Mirsky: I believe so because multipoint is the base.
Martin Vigoureux: you can do that via normative reference which you have. This is a purely process DISCUSS.
Greg Mirsky: I have no problem with that if it doesn’t raise other eyebrows.
Martin Vigoureux: you might see an update from IESG soon about the use of UPDATE tag.


[BFD for Large Packets (Albert Fu)] 51m30s in recording
Mahesh Jethanandani: one of the questions I wanted to make sure we were all clear on was that if the MTU went over 1500 and it went through MPLS network, we may lose space in the packet due to labels. Is one of the considerations that we wanted to reduce the MTU size?
Albert Fu: issue is 'how low can you go?'  today, 1500, tomorrow 1400.  We're assuming 1500 is lowest common denominator on provider edge devices.  We just need a mechanism to detect that a particular link supports that size.  If BFD fails, we just want to mark it down and not use it.
Reshad Rahman: When we talk about the link, we're talking about single hop BFD.
Albert Fu: Most of them are single hop, e.g. Sydney to US.  A lot of metro-E may carry the traffic over a backbone.
Reshad Rahman: IP single-hop?
Albert Fu: Yes
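
As a rough illustration of what testing a 1500-byte path with single-hop IP BFD involves, the sketch below computes how much padding a BFD control packet would need to fill a given IP packet size. It assumes IPv4 without options (20 bytes), UDP (8 bytes) and the 24-byte mandatory section of a BFD control packet; the function name and the padding approach itself are assumptions for illustration, not details taken from the presentation.

```python
IPV4_HEADER_LEN = 20    # IPv4 header without options
UDP_HEADER_LEN = 8
BFD_MANDATORY_LEN = 24  # mandatory section of a BFD control packet


def bfd_padding_len(target_ip_len, auth_len=0):
    """Bytes of padding needed so the IP packet is `target_ip_len` bytes.

    `auth_len` is the size of an optional authentication section
    (0 when authentication is not in use).
    """
    overhead = IPV4_HEADER_LEN + UDP_HEADER_LEN + BFD_MANDATORY_LEN + auth_len
    if target_ip_len < overhead:
        raise ValueError("target size is smaller than the unpadded packet")
    return target_ip_len - overhead


print(bfd_padding_len(1500))  # 1448
```

If labels or other encapsulation are added along the way, as in the MPLS case raised above, they consume part of the link MTU on top of this IP packet size.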

Matthew Bocci: BFD detecting MTU issue.  It’s really detecting control packets being dropped?
Albert Fu:  Could be connectivity issue, but really the MTU issue being detected.

Rick Taylor: Really like direction this is going.  How does this interact with the don't-fragment bit?

Naiming Shen: BFD for faster failover, control plane is too slow.  Do this in data plane.  Normally BFD detection for all lost traffic.  For MTU issue, it doesn't change all that often.  Doesn't need to be that fast. Use something similar to authentication - do it occasionally.  Don't need to have such a fast pace.  For something like LTE link, don't waste the bandwidth.  
Albert Fu: would use the same timers as the protocol timers.  They're control plane, we can't go too fast with those anyway - don't want false alarms.  But still sub-second.  BFD interval of 100ms considered in one of our deployments.  This problem doesn't really happen a lot.  About once a month.  But when it does happen, we want to resolve it.  If we do this manually, we run scripts with ping.  It's control plane, and thus low reliability.  It can take 15 minutes just to do our manual check.
Jeff Tantsura: for 3.3ms applications with 1500 bytes, we'd kill performance. Talk to guys like Broadcom, Juniper, etc.  Figure out the slow BFD times.
Jeff Haas: we know that we can't scale like that.  Have discussed with Juniper engineering. We know we need to go slower.
Albert Fu: OAM - (Jeff - BFD is OAM for me), in our case it's CFM.  Not all SPs support OAM (IEEE).  We know BFD would work since it's IP.
Jeff Tantsura: Talking about logical OAM on silicon, including BFD
Reshad: This is for single hop.  BFD echo would do the trick.  
Jeff Haas: yes.  Xiao Min draft I worked on.  Not saying any of these is the specific solution.
Albert: BFD echo is not common - 1 out of 3 vendors.
Reshad Rahman: how do you configure this?
Albert Fu: as users, if we know both routers do it, we provision it.
Reshad Rahman: some implementations might drop padded packets. 
Jeff Haas: open discussion point. Is this motivation for BFD v2? See this as similar to OSPF authentication
Mahesh Jethanandani: 2 points. platforms for 1500 byte packets, might require redirect for special handling.  needs to be in operational considerations.  Like idea from Naiming as occasional packet. Use it as a periodic probe.
Jeff Haas: similar to Xiao Min draft
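
The scaling concern about 3.3ms intervals with 1500-byte packets can be put in numbers with a short calculation. This is only a back-of-the-envelope sketch: it counts IP-layer bytes for one session in one direction, ignores layer-2 overhead, and the function name is hypothetical.

```python
def bfd_bandwidth_bps(packet_bytes, interval_us):
    """Bits per second sent by one BFD session in one direction."""
    return packet_bytes * 8 * 1_000_000 / interval_us


# 1500-byte packets every 3.3 ms: roughly 3.6 Mbit/s per session.
print(round(bfd_bandwidth_bps(1500, 3_300)))

# 1500-byte packets every 100 ms (the interval Albert mentioned): 120 kbit/s.
print(round(bfd_bandwidth_bps(1500, 100_000)))

# Unpadded 52-byte packet (IPv4 + UDP + 24-byte BFD) every 3.3 ms, for comparison.
print(round(bfd_bandwidth_bps(52, 3_300)))
```

The gap between the padded and unpadded figures at the same interval is one way to see why slower timers, or an occasional large probe as Naiming and Mahesh suggested, were proposed.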

Greg Mirsky: similar to optimized authentication (Mahesh). May be interested in BIER MTU discovery from Stig Venaas. Usability of path MTU discovery protocols. Aggregation of main MTU, and advertisement of MTU by IGP.  Second point is if we intend to use multi-hop BFD that BFD session follows same path as data flow.
Albert Fu: IGP to MTU, a control plane type signaling rather than data plane.  We want to verify the data plane rather than what the control plane asserts.  wrt path MTU, we do want to try to send 1500-byte packets, we don't want to actually utilize PMTU - want the full size, because it impacts the service.
Greg Mirsky: because you want to be guaranteed, you want to do MTU monitoring on all physical links.  BFD follows best route, how do you monitor all possible paths proactively before switching to them?
Albert Fu: we're only trying to protect cases where we'd be doing single hop BFD.  Want the alarm to go do something.  today, we're in the dark.
Rick Taylor: following up on Greg, you're in an overlay network, in multi-path there's no way to recognize how things are getting exercised in the underlay.  doesn't matter in his use case.
Jeff: with this mechanism we can actually go use this to assert e.g. that the CPE link has MTU 1500 as advertised. Or take component link out of LAG if BFD fails on the component link.
Matthew Bocci: worried about complexity for bursting, but like the idea.
Albert Fu: some vendors have concept of BFD dampening.  Suppresses flapping client protocols (e.g. 60s)
Rick Taylor: big frames tearing down sessions. Sequence numbering draft/RFC.  There are ways to work around flap.