Re: [tsvwg] comment on draft-ietf-tsvwg-ecn-tunnel
David,
The saw it is a-bluntening :) ...
At 20:53 06/11/2009, Black_David at emc.com wrote:
Bob,
> >Clearing ECT(1) to Not-ECT doesn't get logged or raise an alarm,
> >but clearing ECT(1) to ECT(0) does, while clearing CE to Not-ECT
> >doesn't.
[BB]: OK, now I understand your 'clearing' terminology (something
happening at intermediate nodes, not at the egress). I also think I
now understand the nub of our disagreement. (For brevity below I use
'danger' to mean 'suppressing congestion indications' and I use
'alarm' to mean 'log and alarm'.)
1/ You are using alarms to highlight potential danger, whereas I am
using alarms to highlight any protocol violation, whether dangerous or not.
The troublesome cases are where a combination of headers could
indicate danger, or it could be the result of legal legacy ingress
behaviour. This leads to the second nub of disagreement (I know, you
can't have two nubs ;)
2/ You don't want to raise alarms unless we can raise them
consistently for all potentially dangerous cases, whereas I want
alarms in all cases where we can be *sure* there is a protocol violation.
I suggest the following compromise rules in this order:
a) Don't alarm combinations that might be valid
b) Alarm combinations that are potentially dangerous and always invalid
c) Alarm combinations that are invalid but not dangerous only if the
alarm can be applied consistently.
I'll turn these rules into a proposed new Fig 4 at the end.
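
As a minimal sketch of how rules a) to c) compose when applied in order, assuming each header combination has already been judged on three yes/no questions. The function name and its three parameters are hypothetical, chosen only for illustration; they are not from the draft.

```python
def should_alarm(might_be_valid: bool, dangerous: bool, consistent: bool) -> bool:
    """Apply the proposed compromise rules in order (illustrative only).

    might_be_valid: the combination could result from legal behaviour
    dangerous:      the combination could suppress congestion indications
    consistent:     the alarm could be applied to all similar cases
    """
    if might_be_valid:
        return False       # rule a) never alarm a possibly valid combination
    if dangerous:
        return True        # rule b) always invalid and potentially dangerous
    return consistent      # rule c) invalid but harmless: only if consistent


# Inner:CE, Outer:ECT(1): always invalid and potentially dangerous.
print(should_alarm(might_be_valid=False, dangerous=True, consistent=False))
# Inner:ECT(1), Outer:ECT(0): invalid, not dangerous, not consistent.
print(should_alarm(might_be_valid=False, dangerous=False, consistent=False))
```

The ordering matters: rule a) takes precedence, so a combination that might be valid is never alarmed even if it is potentially dangerous.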
> What's the reasoning for removing them? I'm happy to remove the two
> alarms for mixed ECT, just because the case for them is marginal.
I think those both need to be removed, as Phil wants one of them
removed, and I want both of them removed if one is removed, leaving
one last case ...
> But ECT(1) outer with CE inner is an unambiguous error case.
I agree that it's an "unambiguous error case" - my concern is whether
it's worth detecting in isolation.
> We don't have to alarm any of these, but we ought to have a reason.
I'm trying to figure out "who cares?" (aside from us), i.e., is the
alarm useful by itself?
[BB]: We should care, because we are messing with IP itself - the
neck of the hour-glass. If a header combination is unambiguously
wrong, it means there's some bit of kit out there that needs to be
tracked down and shot. Alarms say that to an operator. So we should
continue trying to reach technical consensus rather than compromising
through some voting algorithm.
Put another way, given these are all "SHOULD log and MAY alarm," why
do you care so much to not mention one or two, when an implementer
doesn't have to implement them anyway?
I come up with two scenarios:
- In an ECN environment, the problem situations of concern are loss
of congestion indications. Any change of Not-ECT (inner) to
something else is the big one; we agree on alarming that.
Secondarily, clearing of Inner:CE to something else in the
Outer also drops congestion indications, but 3168 prevents
default alarms on two of the three cases [Not-ECT and ECT(0)],
so I'm having difficulty seeing the value in alarms for
1 of 3 error cases.
[BB]: It's like saying "Better consistent than safe", whereas safety
should override consistency. Put another way, it's like saying "Don't
tell the child not to stroke a doberman unless you have always told
the child not to stroke alsatians".
- In a PCN environment, any downgrade of either CE or ECT(1) should
be alarming ;-), as both carry congestion indications. There
are 5 downgrade cases (CE -> anything else; ECT(1) -> ECT(0)
or Not-ECT), of which only 1 will carry an alarm. Alarming
1 out of 5 error cases seems even more unproductive by
comparison to not alarming any of them ... and instead
writing an appendix that suggests all 5 cases could/should
be alarmed in a (closed system) PCN environment.
[BB]: Losing ECT(1) is less dangerous than losing CE, because
congestion will rise until CE kicks in. That leaves 3 cases that are
potentially dangerous (CE -> anything else). 2 of these could be
valid (CE -> ECT(0) or Not-ECT), so by rule a) they have no alarm.
Even though Inner:CE; Outer:ECT(1) is invalid, one might think it's
safe, because the egress will revert it to CE. But it is a sign that
some crap implementation is (probably) changing CE to ECT(1). It
might be doing the same to non-tunnelled packets too - potentially
dangerous. This is both invalid and potentially dangerous, therefore
by rule b) it gets an alarm.
Finally, the two with an ECT inner and a different ECT outer are not
potentially dangerous. One might be valid if PCN is being used but
the other is always invalid. Therefore, by rule c) neither gets
alarmed, because neither is dangerous and they can't be alarmed consistently.
In summary, I propose we have Fig 4 like this:
+---------+------------------------------------------------+
|Incoming |             Incoming Outer Header              |
|  Inner  +---------+------------+------------+------------+
| Header  | Not-ECT |   ECT(0)   |   ECT(1)   |     CE     |
+---------+---------+------------+------------+------------+
| Not-ECT | Not-ECT |Not-ECT(!!!)|Not-ECT(!!!)|  drop(!!!) |
|  ECT(0) |  ECT(0) |   ECT(0)   |   ECT(1)   |     CE     |
|  ECT(1) |  ECT(1) |   ECT(1)   |   ECT(1)   |     CE     |
|    CE   |    CE   |     CE     |     CE(!!!)|     CE     |
+---------+---------+------------+------------+------------+
          |                 Outgoing Header                |
          +------------------------------------------------+
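
For illustration, the proposed table could be sketched as a plain lookup keyed on (inner, outer). This is a minimal sketch, not an implementation from the draft: the names `FIG4`, `decapsulate` and `DROP` are hypothetical, and the boolean simply records whether the cell carries the (!!!) marker.

```python
# Illustrative sketch only; all names below are hypothetical.
NOT_ECT, ECT0, ECT1, CE = "Not-ECT", "ECT(0)", "ECT(1)", "CE"
DROP = None  # stands for the "drop" cell in the table

# (inner, outer) -> (outgoing codepoint or DROP, True if marked "(!!!)")
FIG4 = {
    (NOT_ECT, NOT_ECT): (NOT_ECT, False),
    (NOT_ECT, ECT0):    (NOT_ECT, True),
    (NOT_ECT, ECT1):    (NOT_ECT, True),
    (NOT_ECT, CE):      (DROP,    True),
    (ECT0, NOT_ECT):    (ECT0,    False),
    (ECT0, ECT0):       (ECT0,    False),
    (ECT0, ECT1):       (ECT1,    False),
    (ECT0, CE):         (CE,      False),
    (ECT1, NOT_ECT):    (ECT1,    False),
    (ECT1, ECT0):       (ECT1,    False),
    (ECT1, ECT1):       (ECT1,    False),
    (ECT1, CE):         (CE,      False),
    (CE, NOT_ECT):      (CE,      False),
    (CE, ECT0):         (CE,      False),
    (CE, ECT1):         (CE,      True),
    (CE, CE):           (CE,      False),
}


def decapsulate(inner, outer):
    """Return (outgoing header, alarm flag) for one decapsulated packet."""
    return FIG4[(inner, outer)]


outgoing, alarm = decapsulate(CE, ECT1)
print(outgoing, alarm)
```

Keeping the alarm flag beside the outgoing codepoint makes it easy to check that exactly four cells carry (!!!): the three Not-ECT inner cases plus Inner:CE with Outer:ECT(1).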
Grounds for consensus?
If so, I will also modify the last bullet of Section 7 to roughly
reflect the above abc rules (the alarms part of the design guidelines
for alternate ECN tunnels).
Bob
Thanks,
--David
> -----Original Message-----
> From: Bob Briscoe [mailto:rbriscoe at jungle.bt.co.uk]
> Sent: Wednesday, November 04, 2009 12:03 PM
> To: Black, David
> Cc: philip.eardley at bt.com; tsvwg at ietf.org; Black, David
> Subject: RE: [tsvwg] comment on draft-ietf-tsvwg-ecn-tunnel
>
> David,
>
> On alarms - our final issue...
>
> We're agreed a Not-ECT inner with anything else in the outer gets
> logged/alarmed.
>
> At 21:24 29/10/2009, Black_David at emc.com wrote:
> > > FWIW, this decision about whether ECT(1) can be another form of CE
> > > will probably also mostly resolve the dangling topic of what
> > > transitions may generate alarms.
> >
> >That issue surrounded what to do when the inner and outer headers
> >have different ECT values. PCN uses Inner:ECT(0)/Outer:ECT(1), but
> >has no use for the other case, Inner:ECT(1)/Outer:ECT(0).
> >
> >The log/alarm behavior in Figure 4 of the -04 draft is somewhat
> >alarming ;-) ;-).
> >
> >Clearing ECT(1) to Not-ECT doesn't get logged or raise an alarm,
> >but clearing ECT(1) to ECT(0) does, while clearing CE to Not-ECT
> >doesn't.
>
> Eh? I think somehow you're reading the table wrong, or typing wrong,
> or I'm misunderstanding what you mean by 'clearing'.
>
> Clearing ECT(1) to Not-ECT does raise an alarm in Fig 4. There's no
> case where ECT(1) is cleared to ECT(0). And no case where CE is
> cleared to Not-ECT.
>
> Ug.
>
> >Before you write a long response, I'm aware that the
> >explanations of what does vs. doesn't get the log/alarm treatment
> >involve what happens when a new tunnel decapsulator is paired with
> >one of the two RFC 3168 encapsulator variants,
>
> The reason I added these alarms is actually not to do with pairing
> with 3168. They are just cases that have always been errors, but 3168
> omitted to log/alarm them.
>
> What's the reasoning for removing them? I'm happy to remove the two
> alarms for mixed ECT, just because the case for them is marginal.
> But ECT(1) outer with CE inner is an unambiguous error case.
>
> We don't have to alarm any of these, but we ought to have a reason.
>
> >but the result is
> >still strange and possibly of limited value for PCN.
>
> True.
>
> Cheers
>
> Bob
>
>
> >Let me suggest a blunter alternative:
> >
> >- Only log/alarm (!!!) by default the transitions away from Not-ECT
> > in the inner header (first line in Figure 4, 3 right hand boxes).
> > These are just plain wrong, no matter what.
> >- Do not alarm any other combinations by default. That puts a rapid
> > end to this discussion.
> >- Provide an explanation (in an Appendix?) about what to alarm in a
> > controlled PCN environment. The short summary is that because
> > PCN has to be deployed in a closed/controlled network, it's
> > reasonable for the operator to know (in some cases) that there
> > are no RFC 3168 encapsulators, and therefore be confident in
> > explicitly enabling a considerably more aggressive log/alarm
> > approach that catches all reduction/removal of congestion
> > markings by comparison to what's in Figure 4.
> >
> >If this is done, I think the current Appendix F can be bit-bucketed,
> >and that's probably an improvement.
> >
> >Thanks,
> >--David
>
> ________________________________________________________________
> Bob Briscoe, BT Innovate & Design
>
>
>
________________________________________________________________
Bob Briscoe, BT Innovate & Design