Hello,

I have been selected as the Routing Directorate reviewer for this draft. The Routing Directorate seeks to review all routing or routing-related drafts as they pass through IETF last call and IESG review, and sometimes on special request. The purpose of the review is to provide assistance to the Routing ADs. For more information about the Routing Directorate, please see http://trac.tools.ietf.org/area/rtg/trac/wiki/RtgDir

Although these comments are primarily for the use of the Routing ADs, it would be helpful if you could consider them along with any other IETF Last Call comments that you receive, and strive to resolve them through discussion or by updating the draft.

Document: draft-ietf-roll-security-threats-09
Reviewer: Manav Bhatia
Review Date: 25-08-14
IETF LC End Date: 28-08-14
Intended Status: Informational

Summary:

I have some concerns about this document and recommend that the Routing ADs discuss these issues further with the authors.

Comments:

This document presents a threat analysis for securing the routing protocols used in LLNs, which are more prone to packet losses and other vulnerabilities. It then discusses how those attacks/threats can be mitigated.

Major Issues:

(1) Sec 7.2.2 "Countering Overclaiming and Misclaiming Attacks" says that "counter to overclaiming and misclaiming may employ: comparison with historical routing/topology data;". I don't understand how this will work. Assume node A was always a leaf node and was never attached to any routing device. Later, a new router gets attached to this node, and node A starts advertising this new router. How will the peers know whether it's a legitimate advertisement or an "overclaim" from node A? I don't agree with the notion that somebody could look at the historical data to figure out whether a router is over/under claiming. I also don't think it's a good idea to suggest that we "restrict realizable network topologies" to counter overclaim/misclaim attacks.

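To make the concern in (1) concrete, here is a minimal sketch of what a history-based check might look like. The function name, the data layout, and the node label "R" are all hypothetical and are not taken from the draft; this is only one plausible reading of "comparison with historical routing/topology data".

```python
# Hypothetical history-based check; names and data layout are illustrative
# assumptions, not something the draft specifies.
def is_suspicious(history, advertised):
    """Flag an advertisement if it names any neighbor absent from history."""
    return bool(set(advertised) - set(history))


# Node A has always been a leaf, so its history holds no attached routers.
history_of_a = set()

# Case 1: a new router R really has been attached to A.
legitimate_advert = {"R"}

# Case 2: A falsely claims reachability to R (an overclaim).
overclaim_advert = {"R"}

# Both adverts carry identical contents, so the check returns the same
# verdict for each and cannot separate the legitimate case from the attack.
print(is_suspicious(history_of_a, legitimate_advert))
print(is_suspicious(history_of_a, overclaim_advert))
```

Under these assumptions the check either rejects legitimate topology changes or accepts overclaims, depending on how it is tuned, which is why I doubt that historical data alone can settle the question.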
(2) Sec 7.3.2 "Countering Overload Attacks" suggests that "overload attacks" can be countered by "isolate nodes which send traffic above a certain threshold based on system operation characteristics;". Has this been adequately discussed in the WG? It's certainly possible that, in the event of a reroute/reconvergence, a certain node can momentarily get more traffic than the threshold. Ordinarily, such nodes only drop the additional traffic. However, this document recommends "isolating" such nodes. Doing this, IMO, can actually worsen the problem and will have a cascading effect where a different node now becomes overloaded.

I would suggest that a better way to counter the overload attack is to treat such routers/nodes as leaf nodes in the graph. This would mean that such a router/node is used only for reaching its directly connected devices and is never placed on the transit path to reach other nodes/routers.

Minor Issues:

(1) A common threat in routing protocols is that some unauthorized entity somehow manages to gain access to keying material. Using this material, the attacker can send packets that pass the authenticity checks based on Message Authentication Codes (MACs). The most obvious way to mitigate this threat is to periodically change the keys currently in use by the legitimate routing peers. Hence, routing protocols designed for LLNs must make provision for easily changing their keying material. Additionally, security mechanisms designed for RPL must be such that operators can quickly change keys without disrupting the routing system (data loss) and with minimal operational overhead/expense. If there is significant overhead, then this can lead to the keys not being changed often enough. I don't think this aspect is covered in the document.

(2) Nodes in this environment can be installed but not switched on for quite some time. How would such devices get the updated keying material if it has changed by the time they get turned on?
On a related note, what happens if these nodes support an authentication/encryption algorithm that gets deprecated by the time they are switched on? This can happen if a particular node lies dormant for a few years before it's turned on (refer to Sec 4.3).

(3) An automated key management protocol is a very important component of the security framework. Do all nodes in the LLN need to be manually configured? This may not be possible if there is a large number of nodes in the routing domain. Wouldn't it make sense to actually discuss this?

(4) Sec 4.4 "RPL Security Objectives". Are we interested in message content validation? Lest there be any doubt, let me clarify that I am not talking about message integrity. I am talking about ensuring that a claim made in a route advertisement is indeed correct before accepting it (like SIDR). If we're not doing this, then it might help to explicitly state that message content validation is out of scope. If you're adding this, then a note or two on why would be extremely helpful.

(5) Sec 4.4 says "Hence, routing in LLNs needs to bootstrap the authentication process and allow for flexible expiration scheme of authentication credentials." Can you explain this a bit more? Will this not lead to a circular dependency where routing bootstraps security, but you need security already in place for routing to work (to ensure we're speaking to an authorized node)?

(6) Sec 7.3.3 "Countering Selective Forwarding Attacks". Are you really suggesting that RPL should redundantly flood protocol messages over multiple paths in the hope that at least one will make it to the destination? Given the delicate energy and network utilization constraints, this just doesn't look right. Shouldn't we focus more on ensuring that we don't get a malicious insider node than on what we can do once we have one inside our routing domain?

(7) Sec 7.3.3 suggests "dynamically selecting the next hop from a set of candidates."
to counter selective forwarding attacks. I am not sure I completely understand the point. Are you suggesting that we should consider multiple paths when constructing the shortest path tree? This will only work when you have ECMP, because it's only then that you have a set of next hops that you can use without affecting the total cost to reach the destination. Without ECMP, how can you have a set of next hops to choose from? I don't think you're alluding to pinning the path here?

(8) Sec 7.3.4 "Countering Sinkhole Attacks" suggests "isolate nodes which receive traffic above a certain threshold;". I disagree with doing this without further qualifying what is meant by a "certain threshold", for the same reason I cited above in Major Issue (2).

Nits:

(1) There are several instances where RFC 2119 keywords are used. While I personally don't have an issue with using those keywords in an Informational draft, I was grilled in the IESG review on this and went through the pain of removing them at the last minute.

(2) Sec 6 describes the threats and the possible attacks on RPL. In this context the title of Sec 6.1 is very clear, and the reader understands the threat/attack being discussed in the following subsection. However, the title of Sec 6.2 is very vague. I had to read the entire subsection to understand what it was about. I think what you want to cover are the threats you are exposed to if your system *lacks* confidentiality.

Cheers, Manav