[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Comments on draft-spagnolo-manet-ospf-design-00.txt



Tony and all,

Thanks for your comments.  Please see my response to
two of your comments below.


Tony Przygienda wrote:



Richard, Gary, Russ,

chewed my way through the recent threads and simulation results and here
are my 'meta'-comments.
From simple to hard:

a) The simulation is considering relatively-low-rate-of-neighbor-change
scenarios to jump to
the conclusion that incremental-hellos are of little value (as has been
observed by other people).
On one hand, I agree obviously with the argument that with higher
density of nodes can cause
much higher rates of neighbor change as well as neighbor numbers, on the
other hand, the
complexity of the proposals is low enough to make the incremental-hellos
a belts-and-suspenders
optimization kind of thing, even if the link bandwidth overhead will not
be staggering.
Finally, I didn't drill through the proof of TBRPF to have an opinion as
to the merits
of either proposal but Russ's stuff looks simple enough ;-)


Actually, the main problem with Cisco's Hello protocol is the
additional overhead due to the unicast Hello requests, and the unicast
replies to those requests. In dense, highly mobile networks, this can
generate a lot of additional packets (as I discussed in a previous
message) and can actually generate more overhead than full Hellos.
(In highly mobile networks, it is desirable to use a smaller Hello
Interval for faster response to topology changes, so it is important
to minimize the size of Hellos in such networks.)

The incremental Hello protocol of TBRPF does not generate any packets
in addition to the one Hello per Hello Interval, but reports each
neighbor change in at most 3 (or k) consecutive Hellos.

To see how much overhead can be reduced by using incremental
Hellos versus full Hellos, I ran some ns-2 simulations.
I compared the incremental Hello protocol of TBRPF (RFC 3684)
to a modification of this protocol which uses full Hellos
similar to OSPF (each seen neighbor is included in each Hello).
Both protocols were run over IP over IEEE 802.11 with rate 2 Mbps.

The simulation model used 100 and 200 nodes in a 670x670 meter square,
with a transmission radius of 250 meters. Nodes moved according to the
the random waypoint model with maximum node speeds from 5 ms to 30 m/s
and a pause time of 0. The Hello Interval was 1 second (reasonable
for high mobility). The simulation was run for 500 seconds for
100 nodes and 200 seconds for 200 nodes. The results below show the
overhead in kbps for Hello packets:

           Max speed  Full Hellos   Incremental Hellos  % reduction
           ---------  -----------   ------------------  -----------
100 nodes      5 m/s    201.9 kbps       70.0 kbps          65.3%
100 nodes     10 m/s    220.1 kbps       73.8 kbps          66.5%
100 nodes     20 m/s    206.4 kbps       79.2 kbps          61.6%
100 nodes     30 m/s    215.2 kbps       85.0 kbps          60.5%

200 nodes     20 m/s    708 kbps         198 kbps           72%

The results show that incremental Hellos reduced the overhead by about
72% for 200 nodes, and 60% to 65% for 100 nodes (depending on the node
speed). The additional overhead for LS updates was 58.7 kbps for
100 nodes and 20 m/s. (TBRPF uses an efficient method for disseminating
partial topology information which uses incremental LS updates that
share the same packets as Hellos.)


b) Obviously flooding is a more interesting topic and I liked to read all the interesting tossing of ideas. i) As first observation, the predicting of transitions between acknowledgable and non-to-be-acked LSAs will be hard (more in c) or rather prohibitive in terms of computing power if done well. ii) As second, having two different flooding mechanisms flipping will make something that is very fragile doubly so so it should be probably kept as a life-saving mechanism only. iii) Third, from my experience with OSPF & ISIS flooding implementation and deployment I observed that ISIS flooding tended to be somehow simpler to implement and debug and keep stable in the field. BDR did not buy much in practical deployment and the initial-TDBX was somehow faster but insignificantly so. My point is probably that if you find that unreliable flooding is in most cases a good or better solution, it may not be worth to build a hybrid or tweak reliable flooding much (albeit the protocol definitely very much deviates from OSPF then ;-) The cost is however that you have to shape the flooding on the link (the famous 33msec timer that can be deadly for a large-scale implementatoin if done naively) but that can be implemented using proper leaky-bucket techniques fairly cheaply. The argument about mobile-non-moving networks is strange to me (I mean the sensors network thing) since it seems to go into the direction of an orthogonal, low link quality-low bandwith routing rather than mobile routing (kind like ALSR). iv) Fourth, prunning of the flooding tree (MBR work, flooding spanning tree and such) is hard to get right and I don't know which of the ideas have most merit. I have to read and think more here. The MBR looks reasonable to me (albeit the algorithms to be run are a tad convoluted), I also always liked the flooding-spanning-tree idea with a token passed around to generate the spanning tree and am surprised nobody picked up on that.

This is similar to the CDS approach, since the non-leaf nodes of a spanning tree form a CDS. Instead of constructing a tree (which may depend on the global topology), in a mobile network it may be better to select the CDS nodes based on local topology. I.e., each node decides whether it is in the CDS based on 2-hop neighbor information.

However, I agree with you that a source-independent approach
is simpler than the source-dependent MPR approach.
Whether to use a (source-independent) CDS or (source-dependent)
MPRs is an issue that needs to discussed.

Richard


c) Finally, I personally think from having seen it in another context that the 'predicting future' to know when to ACK and when not is bound to either fail or end up in some pretty heavy math that cannot be run real-time. Either we're dealing with something that is unpredictable (in terms of self-similarity beta=0.5), kind like exponential distribution or otherwise we can think about stuff in terms of self-similar pattern (since I do not believe that a single correlation over a well-known time-scale can be observed). If we know the beta (or measure it, which is a tad expensive computationally), there are methods to detect the time-scale of trends that we encounter (and therefore predict the future but only in terms of absolute beta, you know that the trend will persist with some probability but you cannot know which direction it is heading) and based on that an educated guess could be taken. But this again takes some serious computing power which I cannot believe will pay in mobile nodes just to optimize flooding. And in case you can do all that, yes, the at-source-squelching approach works much better that receiver based for couple of reasons I won't dwell further into.

As a 'meta'-'meta' in connection to the group's meeting (sorry for this
time, I try to be in D.C.) I think even more
than before that the wireless extensions should be kept in a special
OSPF area. The ideas passed around are radical enough
to almost certainly make a general-interface-type-solution overburdened
and the mix of different interfaces in same
area very fragile deployment-wise.

thanks

-- tony