[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Sip] Good Hash: draft-ietf-sip-fork-loop-fix-03
----- From: "Robert Sparks" <rjsparks at nostrum.com>
That may be exactly why call-id and cseq were included in the hash in
3261. Assuming that's so, then
all we need are bits that change between transactions. The first part of
the via branch parameter would
be enough.
So, I propose I add text explaining why you SHOULD (do we make this
MUST?) add some set of bits that
change between requests, and note that either the first part of the via
branch or the combination of call-id
and cseq would be good choices.
That's quite cunning Jeroen! I think it can only be SHOULD, because the
whole thing is SHOULD and RECOMMENDED etc.
One issue is that this works well for the cryptographic hashes, but I don't
think it will work for the linear hash I mentioned in an earlier e-mail,
i.e.:
for( i=0; i<strlen(data); ++i )
hash = hash * MULTIPLIER + data[i];
where MULTIPLIER is 31 or 37.
That's because if:
HASH( C1 . Ra ) == HASH( C1 . Rb ) then
most likely HASH( C2 . Ra ) == HASH( C2 . Rb )
where:
C1 is the fixed data specific to Call 1, e.g. Call-Id, and
C2 is similar for the second call, and
Ra is the routing input for the first time through proxy and
Rb is the routing data for the second time through the proxy and
. is some form of concatenation operator.
Because
HASH( C1 . Ra ) can be converted into HASH( C1 . ? ) + HASH( Ra ) and
then HASH( C1 . ? ) can be removed from the equation in each case.
According to some site on the internet [1] (so it must be true), Perl's hash
function is:
x[] = { 61,59,53,47,43,41,37,31,29,23,17,13,11,7,3,1 };
for( i=0; i<strlen(data); ++i )
hash = hash * 5 + data[i] * x[i%16];
This is still fairly linear despite x[], so doesn't fix the problem above.
But if you had:
len_call_id = strlen( Call-Id );
for( i=0; i<strlen(data); ++i )
hash = hash * 5 + data[i] * Call-Id[i % len_call_id];
or even:
len_call_id = strlen( Call-Id );
for( i=0; i<strlen(data); ++i )
hash = hash * 5 + data[i] ^ Call-Id[i % len_call_id];
I think that's sufficient non-linearity such that:
If HASH( C1 . Ra ) == HASH( C1 . Rb ) then
there's a good chance HASH( C2 . Ra ) != HASH( C2 . Rb )
If that's the case, this would be considerably more efficient than an MD5
hash. Trouble is I have no proof of my assertion about the non-linearity
effect, so it would seem difficult to include something like the latter hash
function in the draft. What I have proven is that it would be a lot of
waffle to discuss such issues in the draft!
So on the one hand the draft can recommend an MD-5 hash, but I think is
unnecessarily computationally demanding for the task, or it can waffle about
other issues and allude to simpler hashes but not have a very good
mathematical basis.
Thus I'm not really sure what to suggest! Maybe the text in the core of the
draft could remain the same, but an informal appendix could include
discussion such as above!
Hope that helps !!!!
Pete.
[1]
http://www.cs.duke.edu/~mlittman/courses/Archive/cps130-97/lectures/lect11/node25.html
On Sep 14, 2006, at 5:20 AM, Jeroen Van Bemmel wrote:
I suppose one issue is that if you do happen to be the 20 billionth
person
trying to make such a call, then your call will fail every time, which
may
be a bad effect.
(jvb) And THAT'S the reason to include the call-id in the hash: to
ensure that an unlucky caller does not suffer the same fate twice (or
rather: to make it very unlikely)
Jeroen
--
=============================================
Pete Cordell
Tech-Know-Ware Ltd
for XML to C++ data binding visit
http://www.tech-know-ware.com/lmx
(or http://www.xml2cpp.com)
=============================================
_______________________________________________
Sip mailing list https://www1.ietf.org/mailman/listinfo/sip
This list is for NEW development of the core SIP Protocol
Use sip-implementors at cs.columbia.edu for questions on current sip
Use sipping at ietf.org for new developments on the application of sip