Hello,

I have been selected as the Routing Directorate reviewer for this draft. The
Routing Directorate seeks to review all routing or routing-related drafts as
they pass through IETF last call and IESG review, and sometimes on special
request. The purpose of the review is to provide assistance to the Routing
ADs. For more information about the Routing Directorate, please see


http://trac.tools.ietf.org/area/rtg/trac/wiki/RtgDir


Although these comments are primarily for the use of the Routing ADs, it
would be helpful if you could consider them along with any other IETF Last
Call comments that you receive, and strive to resolve them through
discussion or by updating the draft.

Document:

draft-ietf-roll-security-threats-09

Reviewer: Manav Bhatia
Review Date: 25-08-14
IETF LC End Date: 28-08-14
Intended Status: Informational

Summary:

I have some concerns about this document and recommend that the Routing ADs
discuss these issues further with the authors.

Comments:

This document presents a threat analysis for securing the routing protocols
used in LLNs, which are more prone to packet losses and other
vulnerabilities. It then discusses how the identified threats and attacks
can be mitigated.

Major Issues:

(1) Sec 7.2.2 "Countering Overclaiming and Misclaiming Attacks" says that
"counter to overclaiming and misclaiming may employ: comparison with
historical routing/topology data;"

I don't understand how this will work. Assume node A was always a leaf node
and was never attached to any routing device. Later, a new router gets
attached to this node. Node A now starts advertising this new router.
How will the peers know whether it's a legitimate advertisement or an
"overclaim" from node A? I don't agree with the notion that somebody could
look at the historical data to figure out if a router is over/under claiming.

I also don't think it's a good idea to suggest that we "restrict realizable
network topologies" to counter overclaim/misclaim attacks.
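
To make the concern about historical comparison concrete, here is a minimal
sketch. The function name and the data layout are my own assumptions, not
anything defined in the draft. It shows that a check based purely on history
produces the same verdict for a genuine new attachment and for an overclaim.

```python
def unexpected_claims(history, advertised):
    """Return advertised neighbors that never appeared in the node's history.

    Hypothetical helper: `history` and `advertised` are lists of router IDs.
    """
    return sorted(set(advertised) - set(history))


# Node A was always a leaf, so its recorded history contains no routers.
history_a = []

# Case 1: router R1 really was attached to A, and A advertises it.
legitimate = unexpected_claims(history_a, ["R1"])

# Case 2: A is malicious and falsely claims to reach R1.
overclaim = unexpected_claims(history_a, ["R1"])

# Both cases are flagged identically; history alone cannot separate them.
indistinguishable = legitimate == overclaim
```

Any real scheme would need some additional, independent input to tell the
two cases apart.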

(2) Sec 7.3.2 "Countering Overload Attacks" suggests that "overload attacks"
can be countered by "isolate nodes which send traffic above a certain
threshold based on system operation characteristics;". Has this been
adequately discussed in the WG? It's certainly possible that, in the event of
a reroute/reconvergence, a node momentarily gets more traffic than the
threshold. Ordinarily, such nodes only drop the additional traffic. However,
this document recommends "isolating" such nodes. In my opinion, doing this
can actually worsen the problem and have a cascading effect where a
different node now becomes overloaded. I would suggest that a better way to
counter the overload attack is to treat such routers/nodes as leaf nodes in
the graph. This would mean that such a router/node is used only for reaching
its directly connected devices and is never placed on the transit path to
reach other nodes/routers.
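
As a rough illustration of the leaf-node suggestion, the sketch below runs a
generic shortest-path computation in which flagged nodes remain reachable as
destinations but are never used for transit. This is not how RPL builds its
DODAG; the graph layout, the `no_transit` parameter, and the topology are
all hypothetical and only meant to show the idea.

```python
import heapq


def shortest_paths(graph, source, no_transit=frozenset()):
    """Dijkstra over graph (dict: node -> {neighbor: cost}).

    Nodes in no_transit can still be reached, but their outgoing links are
    never relaxed, so they never sit on a path to some other node.
    """
    dist = {source: 0}
    prev = {}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        if u in no_transit and u != source:
            continue  # treat the flagged node as a leaf
        for v, cost in graph[u].items():
            nd = d + cost
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    return dist, prev


def path_to(prev, source, dest):
    """Rebuild the path from source to dest using the predecessor map."""
    path = [dest]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1]


# Hypothetical topology: A is the cheap transit node towards C.
graph = {
    "root": {"A": 1, "B": 1},
    "A": {"root": 1, "C": 1},
    "B": {"root": 1, "C": 3},
    "C": {"A": 1, "B": 3},
}

dist_normal, prev_normal = shortest_paths(graph, "root")
dist_leaf, prev_leaf = shortest_paths(graph, "root", no_transit={"A"})
```

With A flagged, traffic for C moves to the path through B, while A itself
stays reachable. Nothing is cut off from the network, which is the
difference from isolating the node.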

Minor Issues:

(1) A common threat in routing protocols is that an unauthorized entity
somehow manages to gain access to keying material. Using this material, the
attacker can send packets that pass the authenticity checks based on Message
Authentication Codes (MACs). The most obvious way to mitigate such a threat
is to periodically change the keys in use by the legitimate routing peers.
Hence, routing protocols designed for LLNs must make it easy to change their
keying material. Additionally, security mechanisms designed for RPL must be
such that operators can quickly change keys without disrupting the routing
system (data loss) and with minimal operational overhead/expense. If there
is significant overhead, then this can lead to the keys not being changed
often enough. I don't think this aspect is covered in the document.
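
One common way to change keys without losing packets is to let old and new
keys overlap for a while: every peer installs the new key before anyone
starts sending with it, and the old key is retired only afterwards. The
sketch below assumes packets carry a key ID and uses HMAC-SHA-256. Both
choices, and the `KeyRing` class itself, are my assumptions for illustration
and are not taken from the draft or from RPL.

```python
import hashlib
import hmac


class KeyRing:
    """Hypothetical per-node key store indexed by key ID."""

    def __init__(self):
        self.keys = {}       # key_id -> key bytes accepted on receive
        self.send_id = None  # key ID used for outgoing packets

    def install(self, key_id, key):
        self.keys[key_id] = key

    def activate(self, key_id):
        self.send_id = key_id

    def retire(self, key_id):
        self.keys.pop(key_id, None)

    def sign(self, payload):
        key = self.keys[self.send_id]
        return self.send_id, hmac.new(key, payload, hashlib.sha256).digest()

    def verify(self, key_id, payload, mac):
        key = self.keys.get(key_id)
        if key is None:
            return False  # unknown or retired key
        expected = hmac.new(key, payload, hashlib.sha256).digest()
        return hmac.compare_digest(expected, mac)


# Two peers start with key 1.
a, b = KeyRing(), KeyRing()
for ring in (a, b):
    ring.install(1, b"old-key")
    ring.activate(1)

# Step 1: install key 2 everywhere; nobody sends with it yet.
a.install(2, b"new-key")
b.install(2, b"new-key")

# Step 2: peers switch at different times; both directions still verify.
a.activate(2)
kid_a, mac_a = a.sign(b"route update")
kid_b, mac_b = b.sign(b"route update")
a_to_b_ok = b.verify(kid_a, b"route update", mac_a)
b_to_a_ok = a.verify(kid_b, b"route update", mac_b)

# Step 3: once everyone sends with key 2, retire key 1.
b.activate(2)
a.retire(1)
b.retire(1)
old_key_rejected = not a.verify(kid_b, b"route update", mac_b)
```

How the new key reaches the nodes in the first place is a separate question,
and this sketch does not address it.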

(2) Nodes in this environment can be installed but not switched on for quite
some time. How would such devices get the updated keying material if it has
changed by the time they get turned on? On a related note, what happens if
these nodes support an authentication/encryption algorithm that gets
deprecated by the time they get switched on? This can happen if a node lies
dormant for a few years before it's turned on (refer to Sec 4.3).

(3) An automated key management protocol is a very important component of
the security framework. Do all nodes in the LLN need to be manually
configured? This may not be possible if there are a large number of nodes in
the routing domain. Wouldn't it make sense to actually discuss this?

(4) Sec 4.4 "RPL Security Objectives". Are we interested in message content
validation? Lest there be any doubt, let me clarify that I am not talking
about message integrity. I am talking about ensuring that a claim made in
the route advertisement is indeed correct before accepting it (as in SIDR).
If we're not doing this, then it might help to explicitly state that message
content validation is out of scope. If you're adding this, then a note or
two on why would be extremely helpful.

(5) Sec 4.4 says "Hence, routing in LLNs needs to bootstrap the
authentication process and allow for flexible expiration scheme of
authentication credentials."

Can you explain this a bit more? Will this not lead to a circular dependency
where routing bootstraps security, but you need security already in place
for routing to work (to ensure we're speaking to an authorized node)?

(6) Sec 7.3.3 "Countering Selective Forwarding Attacks". Are you really
suggesting that RPL should redundantly flood protocol messages over multiple
paths in the hope that at least one will make it to the destination? Given
the delicate energy and network utilization constraints, this just doesn't
look right. Shouldn't we focus more on ensuring that we don't get a
malicious insider node than on what we can do once we have one inside our
routing domain?

(7) Sec 7.3.3 suggests "dynamically selecting the next hop from a set of
candidates." to counter selective forwarding attacks. I am not sure I
completely understand the point. Are you suggesting that we should consider
multiple paths when constructing the shortest path tree? This will only work
when you have ECMP, because it's only then that you have a set of next hops
that you can use without affecting the total cost to reach the destination.
Without ECMP, how can you have a set of next hops to choose from? I don't
think you're alluding to pinning the path here?
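
To show what I mean by the ECMP condition, the sketch below computes the set
of next hops that all lie on a least-cost path to a destination. It assumes
symmetric link costs and a generic cost graph. The function names and both
topologies are hypothetical and are not RPL's parent selection procedure.

```python
import heapq


def distances_from(graph, source):
    """Plain Dijkstra; graph is dict: node -> {neighbor: cost}."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, cost in graph[u].items():
            nd = d + cost
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def equal_cost_next_hops(graph, source, dest):
    """Neighbors of source that lie on some least-cost path to dest.

    Assumes symmetric costs, so distances from dest equal distances to dest.
    """
    to_dest = distances_from(graph, dest)
    best = to_dest[source]
    return sorted(
        n for n, cost in graph[source].items()
        if n in to_dest and cost + to_dest[n] == best
    )


# Hypothetical topology with two equal-cost paths from S to D.
ecmp_graph = {
    "S": {"A": 1, "B": 1},
    "A": {"S": 1, "D": 1},
    "B": {"S": 1, "D": 1},
    "D": {"A": 1, "B": 1},
}

# Same topology, but the B-D link is more expensive.
single_graph = {
    "S": {"A": 1, "B": 1},
    "A": {"S": 1, "D": 1},
    "B": {"S": 1, "D": 2},
    "D": {"A": 1, "B": 2},
}

with_ecmp = equal_cost_next_hops(ecmp_graph, "S", "D")
without_ecmp = equal_cost_next_hops(single_graph, "S", "D")
```

In the second topology the candidate set collapses to a single next hop, so
there is nothing to select from dynamically unless the node is willing to
use a costlier path.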

(8) Sec 7.3.4 "Countering Sinkhole Attacks" suggests "isolate nodes which
receive traffic above a certain threshold;". I disagree with doing this
without further qualifying what's meant by a "certain threshold", for the
same reason as I cited above in Major Issue (2).

Nits:

(1) There are several instances where RFC 2119 keywords are used. While I
personally don't have an issue with using those keywords in an Informational
draft, I was grilled in the IESG review on this and went through the pain of
removing those at the last minute.
 
(2) Sec 6 describes the threats and the possible attacks on RPL. In this
context, the title of Sec 6.1 is very clear and the reader understands the
threat/attack being discussed in the following subsection. However, the
title of Sec 6.2 is very vague. I had to read the entire subsection to
understand what it was about. I think what you want to cover are the threats
you get exposed to if your system *lacks* confidentiality.

Cheers, Manav