
Re: [dhcwg] max-unacked-bndupd



On Thu, Aug 17, 2006 at 11:33:09AM -0700, Damien Neil wrote:
> I'm also curious as to the reason for the focus on ensuring that  
> CONTACT messages are processed in a timely fashion.  In the case  
> where the failover connection is clogged with BNDUPD messages,  
> there's no need to send CONTACTs--since the tSend timer will be reset  
> every time a BNDUPD message is sent, and the tReceive timer will be  
> reset every time a BNDUPD message is read.  CONTACT messages are only  

I believe the timers are reset by any message, not just specific
ones (I don't know whether you meant BNDUPD specifically or not):

   The tReceive timer is reset whenever a message is received from this
   TCP connection.  If it ever expires, the TCP connection is dropped
   and communications with this partner is considered not ok.  The
   reject reason 17: "No traffic within sufficient time" is placed in
   the DISCONNECT message sent prior to dropping the TCP connection.

   The tSend timer is reset whenever a message is sent over this
   connection.  When it expires, a CONTACT message MUST be sent.

So, any message, even ACKs if the dialogue is rather uni-directional.
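To make the timer behaviour concrete, here is a rough sketch of how I
read that text.  The class name, method names and interval values are
made up for illustration; only tSend, tReceive, CONTACT, DISCONNECT and
reject reason 17 come from the draft.

```python
class FailoverTimers:
    """Tracks tSend/tReceive for one failover TCP connection (illustrative)."""

    def __init__(self, now, send_interval, receive_timeout):
        # Both intervals are assumed values, not taken from the draft.
        self.send_interval = send_interval
        self.receive_timeout = receive_timeout
        self.t_send_expires = now + send_interval
        self.t_receive_expires = now + receive_timeout

    def on_message_sent(self, now, msg_type):
        # Any outgoing message resets tSend: BNDUPD, BNDACK, CONTACT, ...
        self.t_send_expires = now + self.send_interval

    def on_message_received(self, now, msg_type):
        # Any incoming message resets tReceive, whatever its type.
        self.t_receive_expires = now + self.receive_timeout

    def poll(self, now):
        """Return the action due at time `now`, or None if nothing is due."""
        if now >= self.t_receive_expires:
            return "DISCONNECT"  # reject reason 17, then drop the connection
        if now >= self.t_send_expires:
            return "CONTACT"     # nothing sent recently, so send a CONTACT
        return None
```

Note that msg_type is deliberately ignored in both reset methods: a
peer that only ever sends BNDACKs still keeps its tSend timer from
expiring.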

> useful in circumstances when there is no traffic passing between the  
> peers.

Well, only when no traffic is actively passing, yes.

Imagine a system has 10 buckets for update messages internally,
no matter how big the TCP buffer sizes are.  It fills all 10,
but none of them are being processed - a lock on the database
has held up processing, let's imagine, so none of them can complete.

The server isn't down, it can still be answering DHCP clients (say
from a memory cache of said database) - it just has...commitment
problems.

If you queue 11 updates, then a CONTACT, it will block (or get dropped,
or reset the connection, or who knows) and you may (incorrectly) determine
that the server is down.  Hopefully, however, it just blocks - the
server should have 11 buckets where the 11th is the place it reads
data off the TCP stream into.  Hopefully it just stops reading until
a slot opens up.
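A toy model of that scenario, in case it helps.  This is only a sketch:
UpdateReceiver, its methods and the db_locked flag are hypothetical, a
deque stands in for the bytes sitting in the TCP stream, and "stop
reading" is approximated by leaving the next BNDUPD at the head of the
stream rather than holding it in an 11th bucket.

```python
from collections import deque


class UpdateReceiver:
    """Receiver with a fixed number of internal slots for pending updates."""

    def __init__(self, slots=10):
        self.slots = slots
        self.pending = deque()  # updates waiting on the database
        self.db_locked = False  # True while the database lock is held

    def read_from(self, stream):
        """Read messages in order; stop at a BNDUPD once every slot is full.

        Returns the non-update messages (e.g. CONTACT) handled on this pass.
        """
        handled = []
        while stream:
            if stream[0] == "BNDUPD":
                if len(self.pending) >= self.slots:
                    break  # no free slot: stop reading until one opens up
                self.pending.append(stream.popleft())
            else:
                handled.append(stream.popleft())
        return handled

    def process(self):
        """Commit pending updates unless the database is locked."""
        if self.db_locked:
            return 0
        done = len(self.pending)
        self.pending.clear()
        return done


rx = UpdateReceiver(slots=10)
rx.db_locked = True
stream = deque(["BNDUPD"] * 11 + ["CONTACT"])
stuck = rx.read_from(stream)  # the CONTACT sits behind the 11th BNDUPD
```

With the lock held, stuck comes back empty and the CONTACT is never
seen, even though nothing is wrong with the connection itself.  Once
the lock is released and process() frees the slots, the next
read_from() picks up both the 11th update and the CONTACT.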


So I think the text is correct: it has more to do with the remote
system's processing (e.g. database) blocking than TCP blocking, but the
result, and what we're trying to avoid, is TCP blocking.

The remote system may still be unable to process the CONTACT message,
e.g. because it is single-threaded and stuck on that database lock,
and you'll switch to comms-interrupted most likely.  But it won't be
because of the TCP channel.
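On the sending side, the idea could be sketched roughly as below.  This
assumes max-unacked-bndupd is applied as a simple cap on outstanding
unacknowledged BNDUPD messages; UpdateSender and its fields are
hypothetical names, and the wire list stands in for the TCP connection.

```python
class UpdateSender:
    """Holds BNDUPDs back once max_unacked of them are awaiting a BNDACK."""

    def __init__(self, max_unacked):
        self.max_unacked = max_unacked  # the peer's advertised limit (assumed)
        self.unacked = 0                # BNDUPDs sent but not yet acknowledged
        self.backlog = []               # updates held locally, not on the wire
        self.wire = []                  # stand-in for the TCP connection

    def queue_update(self, update):
        self.backlog.append(update)
        self._flush()

    def on_bndack(self):
        # One acknowledgement frees room for one more update.
        self.unacked -= 1
        self._flush()

    def send_contact(self):
        # CONTACT is never held back behind the local backlog.
        self.wire.append("CONTACT")

    def _flush(self):
        # Send backlog entries only while we are under the peer's limit.
        while self.backlog and self.unacked < self.max_unacked:
            self.wire.append(("BNDUPD", self.backlog.pop(0)))
            self.unacked += 1


tx = UpdateSender(max_unacked=10)
for lease in range(11):
    tx.queue_update(lease)
tx.send_contact()
```

Here the 11th update stays in the local backlog, so the CONTACT goes
out right behind the 10th BNDUPD instead of behind one the peer has no
slot for.  Whether the peer can then act on the CONTACT is a separate
question, as noted above.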

-- 
David W. Hankins	"If you don't do it right the first time,
Software Engineer		you'll just have to do it again."
Internet Systems Consortium, Inc.	-- Jack T. Hankins

_______________________________________________
dhcwg mailing list
dhcwg at ietf.org
https://www1.ietf.org/mailman/listinfo/dhcwg