
Re: [Isis-wg] [rbridge] Why is MTU discovery important?



As Jim said, and as numerous people have said in many previous emails on this thread,
padded Hellos do not work for TRILL: when a padded Hello is too big to cross a
link, the two RBridges do not see each other, and both forward to and from the
link. That causes loops that are not protected by the TTL, and is therefore
fatal. Layer 3 differs from layer 2 in this respect. Again, this has been
explained numerous times before, so it probably won't help for me to explain
it once more.

Not padding Hellos does not "break IS-IS". All it does is drop the extra
job that the IS-IS Hello mechanism performs while it is finding neighbors,
which is testing the MTU size.

In TRILL, although weird things may happen if the topology includes
a link that can't handle the MTU size, that is far less bad than
having two neighbors not see each other.

So, repeating for the nth time....
Keeping the Hello mechanism exactly as it is in IS-IS today will not work for TRILL.
That should be the end of discussion. Not padding the Hellos obviously
works for robustly finding neighbors, and is a trivial change. The
only thing that gets lost is MTU discovery.

The only feasible choices at this point are:
a) only do unpadded Hellos, and live with weirdnesses due to not
testing MTU size, deferring MTU discovery to the future
b) do unpadded Hellos, and come up with a mechanism for MTU discovery now.
It would be easy to do so.

I'm happy either way. If we pick a conservative MTU size (and perhaps the value
that's already in IS-IS is conservative enough) and say in
the spec that LSPs MUST NOT be bigger than that, the probability of
weirdness as a result of not testing MTU size will be very small, probably
even zero.



Radia



James Carlson wrote:
Don Fedyk writes:
> I have a different interpretation.  I think you just broke an IS-IS
> safeguard by not padding Hellos.
>
> The simplest thing you could do is fix the Maximum IS-IS Frame size at some
> lower value and continue to PAD Hellos.

I disagree.  In the context of TRILL, it's *not* a safeguard at all.
It is in fact a problem.

Failing to get Hello messages through (because the "safeguard" can
stop them when there's a restriction) means that your entire network
goes down.

This is the crucial difference.  I realize that the padded Hello is
protecting against an inability to get LSPs through between peers.
However, the TRILL failure mode we're talking about is far worse, and
destroys everything in its path, including innocent bystanders.

To simplify, let's consider two cases:

  1. This new, smaller padded Hello is always small enough to fit on
     all of the networks anyone ever uses.

  2. The new smaller padded Hello fails to get through on some
     network.

In case (1), everything works fine, but it's also true that unpadded
Hellos would work fine as well.  It's a no-op.

In case (2), the "protection" offered by the padded Hello has now made
two RBridges invisible to each other.  They will both become DRBs.
They will both begin forwarding L2 frames.  The network now has an L2
forwarding loop, and it goes down hard.
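
To make the two cases concrete, here is a toy model in Python. It is only a
sketch: the Link class, the function names, the octet sizes, and the
"lowest name wins" tie-break are all made up for illustration and are not
taken from IS-IS or from any TRILL draft.

```python
from dataclasses import dataclass


@dataclass
class Link:
    mtu: int  # largest frame the link will carry, in octets (hypothetical)


def hello_delivered(link, hello_size):
    """A Hello crosses the link only if it fits within the link MTU."""
    return hello_size <= link.mtu


def elect_drbs(link, hello_size, rbridges):
    """Return the set of RBridges that believe they are DRB on this link."""
    if hello_delivered(link, hello_size):
        # Everyone hears everyone else, so exactly one DRB is elected.
        # Lowest name wins here; that tie-break is an assumption.
        return {min(rbridges)}
    # No Hello gets through: each RBridge thinks it is alone on the
    # link and elects itself.
    return set(rbridges)


def has_forwarding_loop(drbs):
    """More than one DRB forwarding on the same link is an L2 loop."""
    return len(drbs) > 1


link = Link(mtu=1400)            # a link with a restricted MTU
rbridges = ["RB1", "RB2"]

padded = elect_drbs(link, 1470, rbridges)    # case (2): Hello too big
unpadded = elect_drbs(link, 100, rbridges)   # unpadded Hello fits

print(sorted(padded), has_forwarding_loop(padded))
print(sorted(unpadded), has_forwarding_loop(unpadded))
```

With the padded size, both RBridges end up as DRB and the loop check is
true; with the unpadded size, one DRB is elected and there is no loop.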

Unless we make this new padding requirement absurdly low (say, 64
octets), such that it's not really useful in any sense but also just
never does any harm, I think we're making a mistake.