RE: PLEASE ignore previous email, was sent accidently [Mip4] New Issue: 3102bis Stale Challenge may cause a DOS


Hi, Pete,
Comments inline.

Regards,
Ahmad

>
> Hi, Ahmad,
>
> Some comments below...
>
> Ahmad Muhanna writes:
>
> [snip]
>  > > So, you are saying that whether or not the FA has received a
>  > > response to the original RRQ from the HA, it forwards the new
>  > > registration request to the HA, not a copy of the old
>  > > one, right?
>  > >
>  > [AM]
>  > YES.
>  > > You used the words "forwards the Registration Request to the
>  > > Home Agent again" but you didn't specify what you mean by
>  > > "the Registration Request".  The new one is not identical to
>  > > the old one, as it differs in both the Challenge and
>  > > Identification fields.
>  > >
>  >
>  > [AM]
>  > Let us read the following in RFC3012-bis under section 3.2:
>  > "  If a mobile node retransmits a Registration Request with the same
>  >    Challenge extension, and the Foreign Agent still has a pending
>  >    Registration Request record in effect for the mobile node, then
>  >    the Foreign Agent forwards the Registration Request to the Home
>  >    Agent again.
>  > "
>  > Is there any concern about that? A retransmitted RRQ always
>  > contains a new ID when timestamp is used.
>
> Good point; I guess I am ok with the wording, if this is
> something we want to do.
>
> [snip]
>
>  > > So we have to be careful here - we are (slightly) weakening
>  > > the strength of the Challenge mechanism if we allow a whole
>  > > range of consecutive challenges to be valid at the FA instead
>  > > of just one.  How do we choose the proper window size?  Is it
>  > > possible to characterize the resulting strength/weakness of
>  > > the challenge mechanism?
>  >
>  > [AM]
>  > If that is TRUE, I believe the Timestamp replay protection mechanism between
>  > MN and HA suffers the same weakness. I do not think that we are weakening
>  > the strength of challenge mechanism here.
>  > I think we either use a window of challenges or always previous challenge
>  > +/- 1.
>  > I do not think defining the window is a big problem.
>
> Maybe not.  I'd still like a review by security folks before
> we do something like this.

[AM]
I have no problem with this.
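
To make the window idea concrete, here is a rough Python sketch of an FA that accepts any of its last N issued challenges instead of only the most recent one. It is an illustration only: the class name, the method names, the window size, and the 16-byte challenge length are assumptions for this example and are not taken from RFC3012-bis.

```python
import secrets
from collections import deque


class ChallengeWindow:
    """Hypothetical FA-side record of the most recently issued challenges."""

    def __init__(self, size):
        if size < 1:
            raise ValueError("window size must be at least 1")
        # deque(maxlen=size) silently drops the oldest challenge
        # once more than `size` challenges have been issued.
        self._recent = deque(maxlen=size)

    def issue(self):
        """Generate a new random challenge and remember it."""
        challenge = secrets.token_bytes(16)  # length is an assumption
        self._recent.append(challenge)
        return challenge

    def is_acceptable(self, challenge):
        """True if the challenge is still inside the window (not stale)."""
        return challenge in self._recent


if __name__ == "__main__":
    window = ChallengeWindow(size=2)
    first = window.issue()
    second = window.issue()
    print(window.is_acceptable(first))   # still inside a window of 2
    third = window.issue()
    print(window.is_acceptable(first))   # pushed out by the third challenge
```

With size=1 this reduces to the case where only the latest challenge is valid. A larger size tolerates late retransmissions but keeps more challenges valid at once, which is the trade-off the security review would need to assess.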
>
>  > >  > > If, as you point out, the forward channel towards the MN is
>  > >  > > congested, and it is impossible to guarantee timely delivery
>  > >  > > of the RRP, the MN may not receive the RRP before it
>  > >  > > increments the ID field (note that only the Timestamp based
>  > >  > > replay protection causes the ID field to increment on a
>  > >  > > retransmission) and retransmits the request.
>  > >  > > However, I fail to see how your changes fix this problem.  It
>  > >  > > seems like even if we had a way to indicate the presence of a
>  > >  > > retransmission to the FA, the circumstances that you outline
>  > >  > > could still take place. In fact, RRPs may be dropped entirely
>  > >  > > on the reverse link due to congestion, not just arrive late.
>  > >  > > It is true that the challenge response mechanism introduces a
>  > >  > > new failure case (STALE_CHALLENGE), but this seems
>  > >  > > incidental to me.
>  > >  >
>  > >  > [AM]
>  > >  > You are highlighting two issues here.
>  > >  > 1. The channel is totally blocked (???) and all RRPs will be dropped on the
>  > >  > way to the MN. This scenario no one has a solution for and we should not
>  > >  > worry about fixing it.
>  > >  >
>  > >  > 2. The second issue is the one I am trying to solve. The incidental case
>  > >  > when the channel is congested and the RRP takes more time than expected to
>  > >  > arrive at the MN, or the channel is so congested that some RRPs are dropped
>  > >  > BUT NOT every RRP is dropped.
>  > >  >
>  > >  > My solution to this problem is to prevent Stale-Challenge DOS by allowing
>  > >  > the MN to indicate to the FA that it is a retransmit and the FA will accept
>  > >  > the RRQ and relay it to the HA. By NOT blocking the retransmit RRQ at the FA
>  > >  > due to STALE_CHALLENGE, a successful RRP with code (00) is always sent to
>  > >  > the MN. This gives the MN a chance to receive one successful RRP message
>  > >  > before MN hits its retry timer (retransmit timer).
>  > >  >
>  > >  > i.e. In the example I outlined earlier, MN will receive RRP (ID4)
>  > >  > "successful" while it is tracking ID4. This makes the MN happy and
>  > >  > Re-Registration will complete successfully.
>  > >
>  > > But, what if the RRP (ID4) is delayed until the MN
>  > > retransmits again? Really, the root cause of this problem is
>  > > that the MN didn't wait long enough for the first response.
>  >
>  > [AM]
>  > Eventually, the MN will receive a successful RRP message with an ID that the
>  > MN is tracking.
>
> Why is that?  Couldn't the response to ID4 be delayed
> arbitrarily until the MN decides to retransmit?

[AM]
I am sorry, I do not understand what you mean. Please elaborate.
>
>  > >  > > Rather, I think we can only fall back on this text which is
>  > >  > > in RFC 3344:
>  > >  > >
>  > >  > >    The maximum time until a new Registration Request is sent SHOULD be
>  > >  > >    no greater than the requested Lifetime of the Registration Request.
>  > >  > >    The minimum value SHOULD be large enough to account for the size of
>  > >  > >    the messages, twice the round trip time for transmission to the home
>  > >  > >    agent, and at least an additional 100 milliseconds to allow for
>  > >  > >    processing the messages before responding.  The round trip time for
>  > >  > >    transmission to the home agent will be at least as large as the time
>  > >  > >    required to transmit the messages at the link speed of the mobile
>  > >  > >    node's current point of attachment.  Some circuits add another 200
>  > >  > >    milliseconds of satellite delay in the total round trip time to the
>  > >  > >    home agent.  The minimum time between Registration Requests MUST NOT
>  > >  > >    be less than 1 second.  Each successive retransmission timeout period
>  > >  > >    SHOULD be at least twice the previous period, as long as that is less
>  > >  > >    than the maximum as specified above.
>  > >  > >
>  > >  > > So, if you know you are on a link with a lot of buffering,
>  > >  > > you should increase the minimum interval between
>  > >  > > retransmissions to take into account the worst-case round
>  > >  > > trip time.  This will ensure that, if there is any response
>  > >  > > to your request in the queue, you will get it prior to
>  > >  > > retransmitting.
>  > >  >
>  > >  > [AM]
>  > >  > I think the process you just highlighted is to provide the MN with a dynamic
>  > >  > mechanism to detect the state of the link. I believe it is very difficult to
>  > >  > statically predict the state of the link that the MN is attached to,
>  > >  > especially with inter-technology handoff, etc...
>  > >
>  > > Well, I think you have to have some estimate of the RTT of
>  > > the link when choosing this parameter.  If you are running
>  > > over a cdma2000 link you probably should not set the initial
>  > > retransmit interval smaller than 4 seconds.
>  > >
>  > [AM]
>  > I do not think that this will solve the problem.
>  > RFC3344 offers this mechanism which I think is a valid one and MNs already
>  > use this mechanism in the field.
>
> What do you mean here by "this mechanism"?  Are you talking
> about the choice of a timer value to use before retransmission?

[AM]
I mean the simple algorithm that RFC3344 suggested for the MN to use for retransmission.
>
> I think choosing a value of 4 seconds, when running over a
> cdma2000 link, will allow a queued RRP time to arrive at the MN.

[AM]
Are you saying that you can always predict the delay of the network in general when running over a cdma2000 link, and that 4 seconds will always work?

>
> If the RRQ is lost, then yes, we will need to go through the
> STALE_CHALLENGE error to get a new challenge.  This may
> extend the time to complete the registration.  However, as
> long as one round trip
> (MN-FA-HA-FA-MN) eventually completes successfully, the MN
> will get registered.

[AM]
"Eventually" is limited here, because the MN will continue to try only until the registration lifetime expires.
>
>  > The issue here is that this DOS happens
>  > while the customer is doing serious activity. I understand that this issue becomes
>  > minor if it happens during initial registration.
>
> I'm not sure I see the difference in seriousness, but I'll
> take your word for it.  Anyway, it seems to me that choosing
> a longer timer fixes the deadlock problem you have pointed out.
>

[AM]
I am not sure that is an optimal solution.  Let us say the MN is running over a cdma2000 link and an MN vendor sets the initial retransmit time to 10 seconds (just to make sure that the RRP is guaranteed to arrive before the retransmit). The problem here is that we are recommending a standard around an exception. In the normal case, service providers do not want the MN to wait for 10 seconds, reserving resources, before it retransmits for the obvious reason of a lost RRQ.

>
>  > >  > > The basic Mobile IP message flow requires a complete
>  > >  > > round-trip between MN & HA, where no link along the path
>  > >  > > drops the message. Retransmission is performed end-to-end.  I
>  > >  > > don't see how or why we should change this.
>  > >  > >
>  > >  > [AM]
>  > >  > Unfortunately, RFC3012bis broke the mechanism outlined above in section
>  > >  > 3.6.3 (RFC3344) by the stale-challenge mechanism.
>  > >  > This solution is needed to enable the MN to make use of the above mechanism.
>  > >
>  > > I don't think the retransmit mechanism is broken, you just
>  > > need to choose the right interval for the type of link you are on.
>  > >
>  > [AM]
>  > I disagree. Are you suggesting that the mechanism offered in RFC3344 is not
>  > needed?
>  > Only specifying the right interval will solve the problem!
>
> What "mechanism offered in RFC3344" are you referring to?

[AM]
It is the very basic algorithm suggested by RFC3344: the MN retransmits after t, then 2t, 4t, ...
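
As a rough illustration of that t, 2t, 4t pattern, and of how the choice of the initial timer limits the number of attempts before the lifetime runs out, here is a sketch in Python. The function name and the 60-second lifetime in the example are assumptions, not values from RFC3344; the sketch only encodes the 1-second minimum and the doubling rule from the RFC3344 text quoted earlier in this thread.

```python
def retransmit_times(initial_timeout, lifetime):
    """Return the offsets (seconds after the first RRQ) of each retransmission.

    initial_timeout: first retransmission timeout t, in seconds.
    lifetime: requested registration Lifetime, in seconds; the MN is
              assumed to stop retrying once this much time has passed.
    """
    if initial_timeout < 1:
        # RFC3344: the minimum time between Registration Requests
        # MUST NOT be less than 1 second.
        raise ValueError("initial timeout must be at least 1 second")

    times = []
    elapsed = 0
    timeout = initial_timeout
    # Keep retransmitting while the next attempt still falls
    # strictly inside the requested lifetime.
    while elapsed + timeout < lifetime:
        elapsed += timeout
        times.append(elapsed)
        timeout *= 2  # each timeout period is twice the previous one
    return times


if __name__ == "__main__":
    for t in (1, 4, 10):
        print(t, retransmit_times(t, 60))
```

With the assumed 60-second lifetime, t = 1 gives retransmissions at 1, 3, 7, 15 and 31 seconds, t = 4 gives 4, 12 and 28 seconds, and t = 10 leaves only two, at 10 and 30 seconds.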
>
>  > > You have also pointed out that receipt of a STALE_CHALLENGE
>  > > is really at best an ambiguous indicator to the MN of whether
>  > > or not it is registered.  In the situation you outlined, the
>  > > FA has a valid registration renewal for the MN but the MN
>  > > doesn't know it.  Perhaps we should state that the MN should
>  > > not disconnect itself merely because its old Lifetime
>  > > expired, when it receives one or more STALE_CHALLENGE results
>  > > from its attempts to register in the meantime.  That is, if
>  > > the MN can continue to send and receive packets on the link,
>  > > it is possible to keep going even though the MN's local
>  > > estimate of the Lifetime has expired.
>  > >
>  >
>  > Are you serious here?
>
> Absolutely - I didn't see anything in RFC 3344 that would
> require the MN to stop sending/receiving packets when its
> local estimate of the Lifetime expires.  Do you see something
> there that would require this? Could you point it out?

[AM]
You are correct. I did not see anything there.
I thought that the MN is authorized to use its data session after successful exchange of authentic control messages. It seems to me that you are suggesting the opposite.

Am I reading you correctly here? As long as the MN can transmit data, it is guaranteed service with its HA and probably does not need to exchange authentic control messages. Is the MN allowed to update its registration lifetime in this case?

 

>
> Can anyone else speak up if there is really a known deadlock
> problem here that we need to address?
>
> -Pete
>
>
>
>

