Re: draft-ietf-ospf-ospfv3-auth-04.txt
Mukesh,
Inline..
Mitchell
Mukesh.Gupta at NOKIA.COM wrote:
>
> Mitchell,
>
> Comments inline..
>
> > > Do you still think that we should mention it everywhere in the
> > > draft that the pkts will be dropped by the IPsec layer ? Or
> > > we should clarify it more in section 2 and make it explicit
> > > that whenever we say "drop the pkt", it means it is dropped
> > > by the IPsec layer ?
> >
> > IMO, Minimally it should be explicitly stated.
>
> Ok we will try to make it explicit in the next version.
>
> > > By the way, isn't that the way we handle authentication in OSPFv2?
> > > If you have no authentication and a FULL adjacency between
> > > the neighbors, and then you change the configuration of one of
> > > the neighbors to use simple/md5 authentication, the adjacency
> > > is not torn down immediately. It takes the dead interval time
> > > before each of them marks the adjacency down.
> >
> > First, what we are dealing with here is online reconfiguration
> > or dynamic reconfiguration. I haven't really seen any
> > discussion in RFC 2328 of changing values after 2-ways/
> > adjs are established.
>
> I haven't seen anything either. I tried our implementation
> and reported the findings.
>
> Are there any implementations that will tear down the adjacency
> immediately when the authentication method is changed ?
>
> > IMO, I have two very different ways of thinking about this.
> >
> > #1 If a field within a pkt is changed that "forces" the
> > link partner to now drop the pkt, the router dead
> > interval time frame is a built-in latency period in which
> > to re-synchronize the values. However,
> > this allows data movement across the adj, where the
> > adj REALLY is no longer valid / synchronized. In addition,
> > if another router duplicates another's router-id, but
> > changes one required synchronized value to invalidate
> > the pkt, and this one pkt causes the adj to be dropped,
> > it could be a DoS type attack.
> >
> > #2 To properly achieve "Faster Failure Detection",
> > IMO Guyal, et al, should have considered the reception
> > of a NOW invalid pkt. This would eliminate the router
> > dead interval delay in tearing down the adj.
> >
> > The issue is the delays / latencies that are built into
> > tearing down a no-longer-valid adj.
> >
> > So, we should be learning from OSPFv2 the behaviours
> > that we don't want in v3, and attempt to remove them
> > versus forcing us to live with our past ?mistakes? for
> > consistency's sake. Thus, IMO, if authentication is
> > CHANGED locally the adj should be torn down immediately,
> > and the reception of a pkt that would cause it to be
> > dropped by the receiver should affect the state of the
> > adj. BTW, this should be only one of many fields that
> > should affect the state of the adj. Don't we have to assume
> > that the adj is going to fail anyway? Is the reception
> > of a hello pkt that will be dropped repeatedly,
> > the first indication that the nbr is no longer reachable?
> > I think yes. Thus, it is a valid nbr state machine event
> > to the down state. However, since it is so drastic,
> > I assume that a CLI command should allow this event.
>
> The current behavior is actually helpful during configuration
> changes. Consider the scenario where an admin is transitioning the
> OSPF (v2 or v3) network from "no authentication" to "simple
> authentication". Because of the current behavior, he/she gets
> enough time to change the configuration on all the routers without
> bringing the network (or data forwarding) down. If the router
> tears down the adjacency immediately, there will be a forwarding
> break in this case.
>
> IMHO, it is not worth tearing down the adjacency when the admin
> makes this configuration change just to notify him/her of an
> incorrect configuration. If the configuration is incorrect
> (mismatching authentication types on routers), the admin will
> know within minutes anyway.
Is it really helpful? Ask any customer whether it is acceptable
for his system to take minutes to respond to a router CLI
change, and I think you will lose that customer!

During this latency period, which can be 1 hr,
no hellos are being responded to, no LSAs are being
sent or responded to, the DR doesn't acknowledge new
nbrs, etc. All due to a CLI timebomb. If it happens
immediately, the admin should know that he just made
a CLI change and that this was the cause.

The time the system normally takes to acknowledge the
change results in longer-than-necessary LSDB
synchronization and failure detection.
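
To put rough numbers on that latency, here is a minimal sketch in Python. It is not taken from any implementation: the class, the function, and the default timer values (10 s hello, 40 s dead interval, configuration change at t=25 s) are all hypothetical, and it assumes the OSPF process is actually told when a hello is rejected, which may not hold if the pkt is dropped lower down by the IPsec layer.

```python
from dataclasses import dataclass


@dataclass
class Neighbor:
    """Hypothetical, heavily simplified neighbor record."""
    dead_interval: int           # RouterDeadInterval, in seconds
    last_valid_hello: int = 0    # time the last acceptable hello arrived
    state: str = "Full"

    def on_hello(self, now: int, valid: bool, teardown_on_invalid: bool) -> None:
        if self.state == "Down":
            return
        if valid:
            self.last_valid_hello = now   # restart the inactivity timer
        elif teardown_on_invalid:
            self.state = "Down"           # policy: react to the bad pkt at once
        # otherwise the invalid hello is silently ignored

    def on_tick(self, now: int) -> None:
        # Inactivity timer: no acceptable hello for a full dead interval.
        if self.state != "Down" and now - self.last_valid_hello >= self.dead_interval:
            self.state = "Down"


def time_to_down(teardown_on_invalid: bool, hello_interval: int = 10,
                 dead_interval: int = 40, change_at: int = 25,
                 horizon: int = 300):
    """Return the second at which the adjacency goes Down, or None."""
    nbr = Neighbor(dead_interval=dead_interval)
    for t in range(horizon + 1):
        if t % hello_interval == 0:
            # Hellos sent after the peer's config change no longer match.
            nbr.on_hello(t, valid=(t < change_at),
                         teardown_on_invalid=teardown_on_invalid)
        nbr.on_tick(t)
        if nbr.state == "Down":
            return t
    return None


print("immediate teardown:", time_to_down(True))
print("dead-interval only:", time_to_down(False))
```

With these assumed values the first mismatching hello arrives at t=30, while the inactivity timer does not expire until a full dead interval after the last good hello at t=20. The gap grows with the configured dead interval.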

If you listed your nbrs, the list would not actually
be correct, since you are no longer communicating
with your nbrs. Let's see. If I am a router and I
have my hello interval set to X secs, I can set up
adjs with one set of routers. Then I change it to
another value and get adjs with another set of routers.
Then if I just toggle the value back and forth within the
router dead interval time frame, I will have adjs
with routers with different hello intervals. Is
this proper operation?

If a change is made that is going to eventually drop
the adj, then why wait? If the change is a two-step
process on a single router, then implement a commit
type functionality.
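
As an illustration only, a commit type functionality could look roughly like the Python sketch below. Every name in it (InterfaceConfig, ADJACENCY_FIELDS, the field names) is hypothetical and not drawn from any router CLI or from the draft. Changes are staged, nothing touches the running config until commit, and commit reports whether a field that the nbrs must agree on has changed, so that the caller can reset the adjs right away.

```python
# Hypothetical set of per-interface fields that both ends must agree on.
ADJACENCY_FIELDS = {"hello_interval", "dead_interval", "auth_type", "auth_key"}


class InterfaceConfig:
    """Staged (candidate) vs. running configuration for one interface."""

    def __init__(self, **running):
        self.running = dict(running)
        self.candidate = {}

    def set(self, field, value):
        # Stage the change only; the running config is untouched.
        self.candidate[field] = value

    def commit(self):
        """Apply all staged changes at once.

        Returns True if any adjacency-affecting field changed, i.e. the
        caller should tear down and re-form adjacencies immediately.
        """
        changed = {k for k, v in self.candidate.items()
                   if self.running.get(k) != v}
        self.running.update(self.candidate)
        self.candidate.clear()
        return bool(changed & ADJACENCY_FIELDS)


cfg = InterfaceConfig(hello_interval=10, dead_interval=40,
                      auth_type="none", description="lab link")
cfg.set("auth_type", "md5")      # step 1 of a two-step change
cfg.set("auth_key", "example")   # step 2; still nothing applied
if cfg.commit():
    print("adjacency-affecting change committed: reset adjs now")
```

The point of the design is that a multi-step change never leaves the router running a half-applied configuration, so there is no reason to wait out the dead interval.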

I just think that this is an area that really needs
a standardized RFC, one that lets the customer have
configuration changes take effect immediately.
>
> Regards
> Mukesh