Minutes ROLL Working Group Meeting - IETF-71 Philadelphia - March 2008
======================================================================

Thanks to Jonathan Hui and Ralph Droms for having taken the minutes.

Agenda
======

1) Agenda/admin (Chairs - 5mn) [5]
2) Charter Review and Milestones (Chairs - 10mn) [15]
3) Home Automation Routing Requirement in Low Power and Lossy Networks
   draft-brandt-roll-home-routing-reqs-00 (Anders - 20mn) [35]
4) Industrial Routing Requirements in Low Power and Lossy Networks
   draft-pister-rl2n-indus-routing-reqs-00 (Geoff - 20mn) [55]
5) Urban WSNs Routing Requirements in Low Power and Lossy Networks
   draft-dohler-rl2n-urban-routing-reqs-00 (Christian - 20mn) [75]
6) Overview of Existing Routing Protocols for Low Power and Lossy Networks
   draft-levis-roll-overview-protocols-00 (David - 25mn) [100]

JP: JP Vasseur
DC: David Culler
GM: Geoff Mulligan
DO: Dave Oran
DW: Dave Ward
AB: Anders Brandt
MJ: Mike St. Johns
CJ: Christian Jacquenet
MT: Matthew Thomas
PD: Pieter De Mil
IC: Ian Chakeres
CP: Charles Perkins

Meeting
=======

JP: Introduction, welcome, ... for this first WG meeting. Review of the charter, Milestones of the WG.
JP conducted the charter review, emphasizing that the requirements documents are important as the foundation for work in this WG.
MJ: Looking at the charter, it seems a bit unconstrained, especially w.r.t. the number of nodes. Is it 15 nodes or 15 million nodes?
JP: As you can see in the requirements documents, scalability is one of the main constraints that we will have to deal with. It is not rare to have LLNs with tens of thousands of nodes.
IC: Important to know the number of routers and the number of routers that are peers of each other.
DC: Two important issues, and they should be addressed by the routing requirements; not sure they are addressed currently.

3) Home Automation Routing Requirement in Low Power and Lossy Networks
   draft-brandt-roll-home-routing-reqs-00 (Anders - 20mn)

AB: From a routing perspective, there are two ways to communicate: either devices communicate directly, or they communicate with a central server. Traffic patterns depend a lot on whether they use one or the other.
AB: One special feature is groupcast; multicast is not the solution because IP multicast requires subscription. Need to have the impression that all nodes react at the same time.
AB: Mobile nodes (such as remote controls) require routes to reconfigure.
AB: Kinds of devices: always on (ideal for routing resources and distributed through the home); battery powered; remote controls that are battery powered, sleeping, and moving at the same time. Makes for some interesting routing issues, where some devices can be used, others can't, and others only sometimes.
DO: How much state is a sleeping node required to maintain while sleeping? I think it should be called out in the requirements.
RD: This state may vary between devices that are doing different things. It may vary a lot.
Chandra: In order to satisfy compliance requirements, they have to be very quick. You need to know within a few hundred milliseconds that a node is dead. Otherwise how do you do security in homes?
AB: You might have a little more slack than that.
GM: Most smoke detectors in the US are powered, not battery powered. So they are very good for infrastructure devices, so that is wrong. The other thing I missed in the draft is the use-cases for HVAC, and in the US that's probably the single largest opportunity for this. Security is then probably the next. It's cool to turn lights on and off; it's required to save energy and others.
RD: Need to separate requirements from application, transport, and routing. Otherwise you get the cross-product.
DO: Good idea to treat differently the nodes that know they have moved vs. nodes that don't know they have moved.
Could optimize for some cases.
DC: One issue that hasn't been raised yet is the issue of groupcast.
DC: Let's save most of the questions until the end.
AB: Need for multi-path routing for robustness (but I agree with comments on the list that we should not specify multi-path). Ability to locate a working path within 250 ms. Neighbor discovery must be smart or frequent.
AB: Controllers may not keep any state about what groups they have joined. May not have state about what controllers are controlling them. So would very much like to avoid traditional multicast.
RD: Might not want to say "can't use IP multicast". Just abstract it and say groupcast without maintaining state about group membership.
MT: One comment is that the WG's focus may not be addressing all of the interest in this room, since we are focusing on a limited set of very specific applications.
AB: How many nodes? Support 200+ nodes in a subnet. Not convinced that there will be 1000's of nodes.
JP: That is one of the reasons to have a separate document for building automation. Scalability is very different.
AB: Very different traffic patterns as well. Centralized control vs. direct device-to-device control.
AB: Very unique area, very real applications today; agree that the draft today is not mature at all.
MT: With different scenarios (building, pipeline, scalability) the solution may be very different; the best for one may be the worst for another. Also what scale are we really looking at? Is it just 100's of nodes? 1000's of nodes?
Chandra: You mentioned synchronization of controlling lamps. Is that command delivery synchronization, or command-response synchronization?
AB: Depends a lot on what network resources you have. If you have 50 nodes and you're unicasting, you're probably going to see the popcorn effect.
PD: If you want to do that, you need a positioning method. Maybe you can use time synchronization as well.
AB: The nodes would then need to keep state about the time.
It's unclear whether nodes actually maintain state about that time.

4) Industrial Routing Requirements in Low Power and Lossy Networks
   draft-pister-rl2n-indus-routing-reqs-00 (Geoff - 20mn) [55]

JP: Kris couldn't make it. Geoff presenting in his place.
Geoff: Summary requirements from ISA SP100
Top issues
- reliability of data
- monitoring application
- 50% battery, 50% wired
Traffic summaries: packet rates, data size, latency, asymmetric traffic patterns
Routing requirements: number of nodes, topology, broadcast/multicast
GM: ROLL is important to increase the number of sensors/amount of information being collected while limiting installation costs.
GM: User requirements (from ISA100): reliability is critical (unclear whether the data is received or data that is received is correct). Bottom line is that devices will be used to collect information, less so for closed-loop control.
GM: There are going to be battery-powered devices, but there will be an equal number of powered devices. Users are looking at a hybrid where a large number of nodes are powered, with an equal number of battery-operated nodes. Idea is that there is a large core that is operating all the time.
DC: Even if you're sitting on the end of a 4-20 mA loop, you don't get much power out of it. So it's still low power.
GM: Right.
GM: Different classes of applications: emergency, closed-loop, open-loop, alerting, logging, downloading, etc. Hopefully no one in their right mind is even looking at wireless for emergency or anything that is completely critical. We're looking at non-critical closed-loop down to condition monitoring and logging.
GM: Traffic requirements: 1/s to 1/hour (average < 1/min), 10s B to 10s KB, 10 ms to 100 ms to 1 s. Asymmetric traffic patterns.
GM: Looking at 10s to 100s of nodes. We don't expect to build networks with 10's of thousands of nodes all communicating over 15 hops and doing something useful.
Instead, we plan to rely on a wired backbone to get off the wireless link as quickly as possible. Expect a small number of hops, support lots of routers to get on the backbone, requirement for broadcast and multicast support across the routing infrastructure.
CP: Can you explain the latency requirement? 10 ms?
GM: No, I can't.
DC: What is clear is that the latency and size requirements are different for the different traffic classes.
CB: When we talk about latencies, we need to qualify whether that's hop-by-hop, end-to-end, whether it includes retransmissions, acknowledgments, etc.
JP: I've always been very reluctant to specify hard numbers, because we've had to live with 50 ms for many years, coming from SDH. So these numbers should be treated as "order of magnitude", no hard numbers.
GM: Need to get more feedback from industry about the requirements in this space.
JP: We did ask a few large companies to participate, so hopefully we can get some end-users giving feedback.

5) Urban WSNs Routing Requirements in Low Power and Lossy Networks
   draft-dohler-rl2n-urban-routing-reqs-00 (Christian - 20mn) [75]

CJ: Scale at tens of thousands of nodes, sensors, actuators, and access points. Multi-functional devices: access points can serve as gateways, data sinks, data sources. Topology may be flat or clustered.
GM: We need to get our terminology right. I'm hoping you meant "routers" and not "gateways" because we are talking about IP here.
CJ: We're talking of routers.
DO: Is it required for these clusters to be self-organizing or administratively organized or both?
CJ: Traffic characteristics: directional flows, regular and spontaneous, sensed data are time and space correlated.
??: Is there a requirement for peer-to-peer communication in an urban network?
CJ: Not anticipated; however, later JP pointed out that sensors in a cluster may communicate in a mesh to aggregate data.
CJ: Deployment in batches out-of-box. Addition and removal of nodes while network is in operation.
DO: Do you mean that it is possible to take a device from the factory directly to the guy who installs it? Or do the devices arrive at some central location from the manufacturer, programmed in batches, and then handed to installers (i.e. made consistent by a middle-man)?
CJ: It is the latter.
DC: Important point here is that there are times where there will be a large number of devices within close proximity and you still need to communicate.
CJ: Routing requirements: unicast, multicast, groupcast; scalability; constraint-based; self-organizing.
Chandra: You mentioned 10,000 nodes; have you talked about the size of the databases?
JP: This is one of the metrics in the protocol survey. David will talk about that soon.
DO: It's one thing to say you need a multi-metric function and optimized. It's another thing to say you are constraint-based. Because one constraint could say "a message will not consume more than X amount of energy."
DW: One way to say it is that we would rather not communicate if such communication will use more than X CPU cycles.
DO: Routing community has a very different notion of constraint-based routing. It's easy to say and hard to do.
JP: Right, and we are talking about constraint-based routing.
GM: Why isn't energy utilization on a device a constraint?
DO: There is a difference between a constraint, a threshold, and a metric. Question is what does it really mean.
DC: Constraint-based routing has a technical meaning and we need to be particular about that and not just use it in the colloquial sense.
JP: The MUSTs and SHOULDs are very important because we will use them for the protocol survey and for developing a protocol if needed.
Draft still needs
- node mobility
- traffic patterns

6) Overview of Existing Routing Protocols for Low Power and Lossy Networks
   draft-levis-roll-overview-protocols-00 (David - 25mn)

DC: Want to set up discussion; intention is not to dive down into any of the analysis in detail.
DC: Goal: platform for discussion or building consensus through that discussion. Suitability, ill-suitability, and technical trade-offs in utilizing existing IETF protocols for ROLL. Not to design a final protocol.
DC: Outcome is that we have a table with protocols and criteria, and we fill in the cells.
MT: You might have a scenario that is a good fit for something and a requirement that is a long way away. But to add that one requirement that is really a long way away can be very costly. It's really hard to pick off different requirements.
JP: We will certainly have to make trade-offs.
DC: You mean, picking off columns.
MT: That's right.
DC: Simplifications are the same (e.g. sending packets 1/hour vs. 1/sec).
MT: That's right. One suggestion is to consider whether or not it significantly biases your cost for a requirement that is used rarely.
JP: That is a perfectly sensible approach.
MT: In the IETF, there's a long history of accounting for rare but high-cost requirements. Maybe graphically draw out where things fit.
GM: Are the protocols only IETF approved protocols? As opposed to some dissertation protocol?
JP: We will only consider IETF protocols here.
DC: For the purposes of this framework, we are dealing with standard protocols that we understand. But it doesn't preclude others from using the framework.
GM: Concerned that we are going to put weightings on the criteria. It will vary from one app to the other.
JP: You're right, it will vary, but we need to take that into account.
GM: But not in this document. We will consider that in a final document that merges things together.
JP: We had a discussion on whether or not we should have separate documents. To be determined.
IC: I think it will be very challenging to come up with a correct set of protocols. Reacting to GM's mention of what variant of the protocols actually fill these fields. There are a lot of variants/optimizations, some of which are compliant with the standard, others that are not.
We should also consider such optimizations. There's been a bunch of work there.
MT: It's almost as if you need a completely different forum for choosing different variants and evaluating them. As it stands, it sounds reasonable. But leave a tick in the box for, maybe some day, we will address this.
DC: What is different about ROLL? Bottom line is there is no real underlying topology. Degree or depth is derived from the physical placement of nodes within the network. That's true for all of the routing within wireless, not unique to ROLL. What is unique to ROLL: scale, degree (density), depth (physical extent/range), churn (some inherent change in topology, not necessarily controllable by the protocol/nodes), communicators (not everyone is active all of the time), destinations (not all nodes are sinks).
Chandra: A lot of the parameters are already applicable to other routing work. However, the value of the parameters may be unique here. You should feed some of these parameters back to the correct routing group.
DC: If there is a correct routing group. We formed this group to find out. Would be perfectly acceptable outcome.
IC: A lot of these are inter-related and may have other functions. These may vary significantly over time.
DC: Absolutely. This is just a way to do the calculation at all. Different applications may cluster very differently. And we need to attend to that.
IC: Besides the number of communicators, the number of hops is influenced by the communication.
MT: A challenge is some believe in OSPF, some believe in IS-IS, and we are here because we don't know which way to go.
JP: We are aware of the challenge and co-chairs are addressing IETF issues carefully.
MT: We're in the box because at the end of the day, we're churning out what is already done.
DC: At this stage.
MT: But if we're selecting what we've already done, we're in the box.
GM: No, but the answer may be we select nothing.
MT: Then we're out of the box, and that's interesting to me.
But some of us have bet their careers.
DC: One easy solution may be to say "we're going to build a new one."
MT: I think that's the hardest one.
JP: Read the charter.
MT: I read it, but I don't believe it.
DC: Constraints: lifetime, physical size, rate of activity, short range, high loss rate, small MTU, low rate links. Low data rates typical, implying that routing protocol rate must be significantly less than the application rate.
Chandra: At some point, are we going to put a number on small MTU?
DC: We'll put a family of numbers. One answer is 128 bytes, that's the 6LoWPAN answer.
DC: Constraints: routing protocols that target microcontrollers, not microprocessors.
DO: Need to differentiate between state in forwarders and state in nodes that don't forward. Just putting another dimension on the cost, which is the degree of uniformity or non-uniformity.
DC: Full-function/reduced-function/special-purpose devices will be considered.
CP: MTU matters, but you can also say packet size matters. And I find it very rare for people to say that packet size does not matter.
DC: Challenges: low-power, lossy, can't throw extra resources at it. What is a link? We can tackle such problems because there are existence proofs today. Others have solved it and are shipping product.
DC: Start of process: important to have an analysis template. Each entry will have considerable supporting evidence. Goal is common understanding of facts, no "winning the match". Expect an active, interactive exchange between Philly and Dublin.
CP: Why must the routing protocol rate be much less than the data rate?
DC: Are you willing to require more for the routing protocol than what you put the network in for?
DC: won't be
Carsten: Sure, why not if it makes the network work better? E.g., smoke detector only sends data once.
DC: Mostly boils down to low-power.
DC: Yes; tried to establish relationship with "lossy derives from low-power".
CP: Yes. The most important case is that it works.
I'm not sure that this is an actual requirement.
??: There only seems to be discussion about the operation of the routing protocol rather than the information you need to store.
DC: This draft only gets to talk about existing protocols.
CP: What you really care about here is the power, right? You want to make sure that high routing overhead doesn't diminish the lifetime of your network.
DC: You're right. But sometimes it's easier to count things, such as counting the number of packets. One example: can I afford to tell everyone because a particular link went down?
CP: Certainly happy to agree; in the case of smoke detectors you may have infinite routing overhead compared to the data traffic. Ideally, we have none.
CP: Some discussion of whether we're in the box or out of the box. Shouldn't there be a favored possibility where we determine the deficiencies of certain protocols and then we feed that information to the appropriate routing group? But if they say "no", then we naturally progress here.
JP: If we can find a possible candidate by tweaking, then we are more than happy to discuss with the appropriate WG.
CP: A favorable situation is having multiple WGs looking to fulfill your needs.
Justin: Comment that a lot of the consideration is about power, not about the loss. It is a lot harder to characterize.
DC: Churn is getting at the loss, and most protocols are designed with how to deal with links that change. So it eventually turns back into bandwidth and power.
DO: One question is distinguishing the definition of the protocol with bits on the wire vs. implementations and what they can do. Other question is whether we get to permit layer violations. For example, overhead, timers, etc. can be optimized with the application's use of asymmetric flows. Same thing for layers below routing. What I'm really saying is how natural it is for certain protocols to utilize such optimizations.
JP: Right. Just like other examples where "L2-agnostic" isn't really "agnostic".
Layer violation is definitely a topic of importance here.
GM: Routing management doesn't show up in any of the equations. Centralized vs. decentralized. What do I need to crunch on the various graphs? How does all this work? What's acceptable in industrial may not be acceptable for my mom.
DC: That's a fair concern. There's another concern about the management characteristics. I may only get a packet every 10 minutes. But when I need to manage the network, I may need delays on human time scales.
JP: It's true. This document doesn't address it and it must be added.
GM: David was saying that the list may be too long. But the list may be too short as well.
JP: Thanks for this first active and productive meeting. We're done. Looking for some useful things before and in Dublin.