Re: [Roll] RPL Design Questions
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Roll] RPL Design Questions



Hi:

I think Tim did a great job at keeping the question text simple.
Thinking about his own answer on this, one might wonder what the actual
cost of things is and how things are related.

The good news is that as far as I can see, the 3 items of the poll are
not related:

- one could bypass the DAG hop Timer mechanism and still allow the
split/merge mechanism that is used for loop avoidance. The result will
probably be more runtime churn and more battery consumption.
- Alternatively one could ignore the split/merge and stay there, simply
advertising infinity to poison the broken path, and yet, the DAG hop
timer would be useful to avoid churn as the new DAG is constructed to
restore connectivity.
- and the hold down is orthogonal, rather related of the validation
process that takes place before a parent can be used at all.

Even better, it appears that a network that incorporates mixed behaviors
still works correctly. If I'm correct on that, then it could be possible
to make those things optional, like having a fully capable node and a
more constrained node's behavior, and then enable a policies in the
former to more or less act as the latter.

The other aspect of the question is the real cost associated to the
listed items.

Question A: 
------------
Feature benefit:
When this mechanism is in place, a DAG can detach and still enable local
(P2P) connectivity. There are use cases for that, though not really
obvious in our requirement drafts. So obviously we'd be losing that sort
of connectivity if the mechanism

Implementation cost:
When a node detaches, it should in theory keep some history of its
previous DAGs as long as those DAGs live, so as to never jump back in
the same DAG, same sequence but at a higher rank. That would be perfect
loop avoidance. But that's not what the spec does. The spec attempts to
mark the wrong landing sites by route poisoning and then allows any jump
after a reasonable time. So the implementation cost at the end of the
day is one timer, and a bit of logic to decide whether that timer should
be armed at all. The current text allows overloading the DAG Hop Timer
for that purpose (that's Q 2).

Question B:
Feature benefit:
This is all about churn. There are occasions like a new sequence or a
DAG accident (new root inserted or removed, split/merge...) where a DAG
and sometimes neighboring nodes are impacted. If we leave it to chance,
nodes will reposition a number of times before things settle, causing
churn and battery depletion. This timer delays things and forces the
changes to spread like a wave in one pass from the root out.

Implementation cost:
In theory that's one timer per parent that this node wants to jump to.
In terms of implementation, this can be reduced to one timer per DAG
that the node might want to jump to, and even further reduced to one
timer at all if the implementation only keeps track of the most
preferred DAG onto which it could jump. Note that the doc allows the
node to use the same timer for Q A). The feature also requires to pass
the DAG base period in the DIO.

Question C) 
Feature benefit:
A node that has poor history is blacklisted. This prevents it to be
reconsidered as a parent for a configurable period of time. This method
avoid periodic churn and black out, or at least makes that period a lot
longer.

Implementation cost:
In theory one timer per blacklisted node, and some data structure per
blacklisted node. In practice, an implementation can use a single
periodic timer and a list of the blacklisted nodes. It can for instance
maintain a counter per listed node that is incremented at each tick.
When the increment reaches a threshold, the node is removed from the
blasklist.


My vote is YES to all the questions. Keep them in the base spec. 
And allow for a fallback behavior in the real constrained devices if
that can cohabit as I think it does.

Pascal

>-----Original Message-----
>From: roll-bounces at ietf.org [mailto:roll-bounces at ietf.org] On Behalf Of
Tim Winter
>Sent: mardi 13 octobre 2009 05:16
>To: ROLL WG
>Subject: [Roll] RPL Design Questions
>
>WG,
>
>Please do let us know what you are thinking w/ regard to the design
>Questions A, B, & C below.  We hope to include your feedback into the
next
>revision of the document, with FSM descriptions as well.
>
>Thanks,
>
>- RPL Authors
>
>
>
>   As discussed at the interim meeting, implementer feedback has
>   indicated that the RPL protocol as proposed appears complex, both in
>   description and actual FSM.
>
>   Our plan of action is to, by end of month, deliver a -04 that will
>   contain actual (non-editorial) changes.  The intent of this mail is
>   a poll over a period of 2 weeks, to help design the functional
>   changes.
>
>   In this spirit, we would like the WG to work on the following
>   questions:
>
>   Question A:  Should RPL make use of the currently proposed local
>      repair mechanism (DAG splitting and merging)?
>
>         [NO implies that DAG repairs shall be coordinated globally
with
>         the use of DAG Sequence Number; the related mechanisms are to
>         be expanded for -04]
>
>   Question B:  Should RPL make use of a hold-up timer and related
>      states/procedures to reduce churn by coordinating the DAG merge?
>
>         [NO implies RPL allows nodes to move (jump) between DAGs with
>         little coordination to reduce complexity of specification/
>         implementation (perhaps w/ Optional hold-up mechanism)].
>
>   Question C:  Should RPL make use of a hold-down timer and related
>      states/procedures to limit flapping when removing DAG Parents /
>      leaving DAGs
>
>         [NO implies RPL allows nodes to freely remove/add DAG Parents
>         as and when they are able in order to reduce complexity of
>         specification/implementation (perhaps w/ Optional hold-down
>         mechanism).
>
>   WG, we expect that this thread will contain a lot of interesting
>   related discussion, but in your comments please do also be sure to
>   try and address the initial questions A, B, and C so as to help us
>   better capture the WG position on these issues.  Please note that A,
>   B, and C are to some degree orthogonal, e.g. `No Local Repair' to
>   Question A does not necessarily imply a particular disposition
toward
>   the Hold-Up mechanism in Question B.
>
>   Thanks,
>
>   RPL Authors
>
>
>
>   Some examples, derived from the present draft, are provided below:
>
>
------------------------------------------------------------------------
>
>      Example: Local Repair with Hold-Up state
>
>
>          :                      :                      :
>          :                      :                      :
>         (A)                    (A)                    (A)
>          |\                     |                      |
>          | `-----.              |                      |
>          |        \             |                      |
>         (B)       (C)          (B)       (C)          (B)
>                    |                      |             \
>                    |                      |              `-----.
>                    |                      |                     \
>                   (D)                    (D)                    (C)
>                                                                  |
>                                                                  |
>                                                                  |
>                                                                 (D)
>
>              -1-                    -2-                    -3-
>
>
>                         Figure 1: DAG Maintenance
>
>   Consider the example depicted in Figure 1-1.  In this example, Node
>   (A) is attached to a DAG at some rank d.  Node (A) is a DAG parent
of
>   Nodes (B) and (C).  Node (C) is a DAG parent of Node (D).  There is
>   also an undirected sibling link between Nodes (B) and (C).
>
>   In this example, Node (C) may safely forward to Node (A) without
>   creating a loop.  Node (C) may not safely forward to Node (D),
>   contained within it's own sub-DAG, without creating a loop.  Node
(C)
>   may forward to Node (B) in some cases, e.g. the link (C)->(A) is
>   temporarily unavailable, but with some chance of creating a loop
>   (e.g. if multiple nodes in a set of siblings start forwarding
>   `sideways' in a cycle) and requiring the intervention of additional
>   mechanisms to detect and break the loop.
>
>   Consider the case where Node (C) hears a RA-DIO message from a Node
>   (Z) at a lesser rank and superior position in the DAG than node (A).
>   Node (C) may safely undergo the process to evict node (A) from its
>   DAG parent set and attach directly to Node (Z) without creating a
>   loop, because its rank will decrease.
>
>   Now consider the case where the link (C)->(A) becomes nonviable, and
>   node (C) must move to a deeper rank within the DAG:
>
>   o  Node (C) must first detach from the DAG by removing Node (A) from
>      its DAG parent set, leaving an empty DAG parent set.  Node (C)
>      becomes the root of its own floating, less preferred, DAG.
>
>   o  Node (D), hearing a modified RA-DIO message from Node (C),
follows
>      Node (C) into the floating DAG.  This is depicted in Figure 1-2.
>      In general, any node with no other options in the sub-DAG of Node
>      (C) will follow Node (C) into the floating DAG, maintaining the
>      structure of the sub-DAG.
>
>   o  Node (C) hears a RA-DIO message from Node (B) and determines it
is
>      able to rejoin the grounded DAG by reattaching at a deeper rank
to
>      Node (B).  Node (C) starts a DAG Hop timer (putting Node (B) in
>      the Hold-Up state) to coordinate this move.
>
>   o  The timer expires and Node (C) adds Node (B) to its DAG parent
>      set.  Node (C) has now safely moved deeper within the grounded
DAG
>      without creating any loops.  Node (D), and any other sub-DAG of
>      Node (C), will hear the modified RA-DIO message sourced from Node
>      (C) and follow Node (C) in a coordinated manner to reattach to
the
>      grounded DAG.  The final DAG is depicted in Figure 1-3
>
>
>
------------------------------------------------------------------------
>
>      Example: Global Repair with Sequence Number / No Held-Up State
>
>
>          :                      :                      :
>          :                      :                      :
>         (A)                    (A)                    (A)
>          |\                     |                      |
>          | `-----.              |                      |
>          |        \             |                      |
>         (B)       (C)          (B)       (C)          (B)
>                    |                                    \
>                    |                                     `-----.
>                    |                                            \
>                   (D)                    (D)                    (C)
>
>
>
>                                                                 (D)
>
>              -1-                    -2-                    -3-
>
>
>                         Figure 2: DAG Maintenance
>
>   Consider the example depicted in Figure 2-1, similar to that
depicted
>   in Figure 1.  Let there be a sequence number associated with the
DAG,
>   as originated from the DAG root, with value N. Initially all nodes
>   have received DAG Sequence Number N. The following example offers
one
>   possible Global Repair scenario to give a high level idea, please
>   note that there are details that would remain to be worked out if
the
>   WG heads in this direction.
>
>   Now consider the case where the link (C)->(A) becomes nonviable, and
>   node (C) must move to a deeper rank within the DAG:
>
>   o  Node (C) must first detach from the DAG by removing Node (A) from
>      its DAG parent set, leaving an empty DAG parent set.  At this
>      point Node (C) is not associated with any DAG [TBD -- Alternately
>      Node (C) remains associated with the DAG at rank infinity].
>
>   o  Node (D) may possibly learn that Node (C) is no longer associated
>      with any DAG and itself becomes unassociated [TBD Node (D) may
>      also retain a sub-DAG relationship with Node (C)].  Let Node (C)
>      and (D) both be free from any DAG, but remember the Sequence
>      Number N associated with their original DAG.  This is depicted in
>      Figure 2-2.
>
>   o  Node (C) and (D) will now be willing to reattach to the original
>      DAG at IMMEDIATELY (no hold-up mechanism) at ANY node offering
>      connectivity to the original DAG and advertising a sequence
number
>      N+1.  Since Node (C) and (D) are no longer members of the
original
>      DAG, only a node who is still a member of the original DAG may
>      possibly present the sequence number N+1.  Such a node who
>      presents N+1 must clearly have another path to the DAG root other
>      than via (C) and (D) and thus may offer a loop-avoiding
attachment
>      point.
>
>   o  [TBD] The DAG Root may periodically issue sequence number
>      increments, causing the issue of N+1
>
>   o  [TBD] The broken link (C)->(A) may be detected through some other
>      means, and signalled to or cause a trigger at the DAG Root,
>      causing the issue of N+1
>
>   o  Let Node (C) hears a RA-DIO message with N+1 from Node (B) and
>      determine it is able to rejoin the original DAG by immediately
>      reattaching j to Node (B) (No hold-up state in this example.  The
>      DAG is now as depicted in Figure 2-3
>
>   o  Node (C) may then subsequently send RA-DIO with DAG Sequence
>      number N+1, allowing Node (D) to reattach (not depicted).
>
>
------------------------------------------------------------------------
>
>_______________________________________________
>Roll mailing list
>Roll at ietf.org
>https://www.ietf.org/mailman/listinfo/roll
>
>_______________________________________________
>Roll mailing list
>Roll at ietf.org
>https://www.ietf.org/mailman/listinfo/roll

Note: Messages sent to this list are the opinions of the senders and do not imply endorsement by the IETF.