The P2PRG meeting at IETF 65 had 60 attendees. Many thanks to the following individuals who made presentations at the P2PRG meeting in Dallas:

Mario Kolberg, Stirling
Ian Downard, NRL
Keith Ross, Polytechnic

The comments I received afterwards from various individuals were all positive. Slides for the presentations can be downloaded from the CORE subgroup page: http://www.cs.uml.edu/~buford/irtf-p2prg/

During the closing discussion, the following topics were proposed as potential research areas for the P2PRG:

1. Security
   Notion of "probabilistic security" could be considered
2. "search" vs "lookup" in DHT
   GloServ (Columbia): CAN + Ontology prototype
3. Privacy vs ability to search
4. Management of P2P systems
   Churn, traffic, poisoning, node status, ...
5. Internet transport and overlay behavior
6. NATs & P2P

(The minutes were compiled by John Buford)

----------------------

A more comprehensive set of minutes was compiled by Jessy Cowan-Sharp from Maryland (and edited by me). Here they are:

Agenda bashing/announcements

Website at UMD with bibliography, info on mailing lists, previous docs, etc: cs.umd.edu/projects/p2prg.
-Link from main ietf RG page as well.

P2P research group
-Idea is to try and solve interesting problems in p2p
-Can participate by joining subgroups or by proposing new subgroups by contacting one or both of the chairs, commenting on draft charter, etc.

Two subgroups in this RG: content and resource discovery, and p2pv6
-Not currently linked from website but will be soon
-Core subgroup site: some background, milestones, etc

There is a new process for IRTF RFCs
-There was a fair bit of interest from the RGs to publish RFCs that were distinguished from other ones.
-Streamline the ability for RGs to produce some longer term documents; should take less time to actually get an RFC published

4 p2pRG drafts posted recently
-p2pv6
-Survey of Research
-Fair file swarming and efficient lookup on unstructured topologies; also available on UMD site.

Presentation: CORE Subgroup: Content, Resource and Service discovery
Slides available on p2pRG website.

Comments:
Q: Ian Downard: read the draft. Seems to be a big emphasis on global scale. The networks he works on are unreliable, so just doing discovery locally is hard enough. That may be an issue that also deserves more attention in this draft.
A: we were trying to avoid topics covered in other areas but that is a good point and this is an open area.

CORE group project: survey
Presentation of preliminary outline
-looking for volunteers to help
-Outline is tentative; hasn't been posted to list yet; want feedback.

Summary of CORE group problem statement (see also draft online)
-Set up definitions: what is service discovery, what is p2p, etc
-different sub problems
1. global scale (what does it mean to say that it should have global scale)
2. desire to support various service formats out there today
-would like the mechanism to work not just in large scale but in other limited/ad hoc network environments
-service overlays: most proposals are building on top of existing DHTs, but this approach should work independent of that layer

Feedback on the list was mostly about definitions and taxonomy, and not so crucial to the main problems.

Other comments on the problem statement?

Unidentified:
Q: the proposal focuses on geographic topologies; why? Can see basing it on internet topology, but not sure about geographic.
A: if you are a mobile node looking for a service close to you, one way to do it is by knowing the geographic location
Q: but if you are accessing the internet, you really care about round trip time, not geographic location
A: yes that's true, but we are thinking about services which may require you to be physically present as well as virtually present (i.e. searching for a printer, or ordering an ecommerce item and then picking it up)

Michael Slavich:
Q: What I am hearing is that there is actually more than one possible descriptor of the location: internet based and physically based. Perhaps you want to use a media relay. So you need both.
A: right, they are not exclusive. The beauty of the p2p indexing schemes is that you can put lots of redundant descriptions into it, and they can all support different query approaches. There was a paper on meta-search discovery, which was interesting. We want to be able to describe services by various kinds of meta data. So you can do lookups with basic indexing or move to more advanced queries.

Unidentified:
Q: not sure if it belongs in this RG, but you might want to say something about cost interactions between routing protocol and overlay. Another one: I remember seeing something about reliability being an option for middleware frameworks, and I think strongly that stronger pipes should be a requirement for middleware frameworks.
A: yes, this is a pretty open area. It's not currently within the scope of the CORE group but it will be important as other activities start looking at p2p. Maybe it's something we can bump up to the WG. In terms of reliability, it's not something we have thought much about.

Closing comments on problem statement
-Would like to get a solid internet draft out of it.
-Want more participation

Presentation: Mario Kolberg: P2P Tools (slides available on site)
Survey of p2p tools: based on publicly available information so if you have information on other tools please let them know.
Also, any further information on how active the simulators are would be welcome.

Topology generators table: an asterisk shows that the generator uses BRITE to convert formats for the simulators.

Questions? Comments? None.

Presentation: Ian Downard, Naval Research Lab (Slides online)
Why NRL is looking at p2p and why they are trying to solve the problem of service discovery. Will also discuss why they built AgentJ.

NRL's interests
-different from most of the military b/c focused on open source and open standards
-Give away all software on website called "Protean Forge" (http://pf.itd.nrl.navy.mil/srss)
-considered to be experts in these types of networks within the military

Most of the networks they are looking at are very dynamic: ad hoc networks, etc. Developing protocols for mobile, ad hoc networks.

Goals for next ten years: develop protocols that support autonomy in the networks. Multi-layered approach: look at many parts of the stack and try to understand how they can interact better, be more efficient, etc. Spend a lot of time on scenario generation, post these online too. Developed several protocols and tools for this area.

AgentJ
Purpose: allows use of NS2 to simulate java applications.
Discussion of implementation, use and technical details

Comments:
Q: what version of ns2 are you using? If you already have an existing ns2 simulation, can you use AgentJ on top of that?
A: yes
Q: so if you have a dht protocol and attach a more sophisticated search agent...

Presentation: Keith Ross: Exploiting p2p systems and DDOS attacks
In order to improve p2p designs of tomorrow, need to understand designs of today.
-in particular, been looking at security issues wrt p2p.
The major file sharing systems out there have been under major attack for years now (by the RIAA and music industry!); they can really be quite vulnerable.

Types of attacks:
-Index poisoning
-Distributed DOS attack

There is the possibility of both attacking FROM p2p systems as well as attacking TO p2p systems

Index poisoning: corrupt the distributed indices by inserting bogus records into them.
-Many p2p systems accept these records with no question.

Overnet filesharing system
-the engine behind eDonkey
-uses DHT
-Probably largest DHT deployed to date: hundreds of thousands of nodes
-Still proprietary, so exact protocols not widely published

DDOS attacks:
-Poison distributed index
-Poison overlay routing tables

Q: the systems you looked at don't do any verification on index insertions?
A: correct
Q: so you could send in multicast addresses too...
A: good point, we hadn't thought of that.
Q: do you know other systems that are similarly vulnerable?
A: Skype, but it has a proprietary protocol which is encrypted, so you have to reverse engineer it, but there have been some recent results in the area. Overnet is not encrypted, so it's much easier to figure out what is going on.
Q: since we have a design effort going on here as well, if you were to do a partial solution kind of thing, that might be useful. It seems like even some kind of verification would be sufficient to thwart many of these attacks. Is he missing something?
A: so you try to verify a TCP connection?
Q: not necessarily. The receiving node looks at the request and says "wooops, you want me to publish this but the IP addresses don't match."
A: that may be sufficient, but NATs would be an issue.

Phillip Matthews:
Q: where are you planning to take this work? Are you planning to suggest ways to avoid the problem? Do you think this is a generic problem? Or just bad engineering design?
A: well, first of all we just want to alert the community to it. Anything with a DHT architecture will have distributed routing tables and indexes, which have the potential to be polluted. In fact many are already being polluted extensively.
So we just think we have to keep an eye on it.
Q: have you done any work on poisoning as a participant?
A: no, haven't done any work here but sounds interesting.
Q: security in p2p systems strikes me as a very difficult problem and an interesting one for this RG to look at. To make a plug, people are encouraged to attend p2pSIP sessions if they are interested in this, and in some ways the problems overlap.

Presentation: Keith Ross: TCP relay selection in p2p networks
Just beginning with this research
Why use TCP relays and how do you choose between available relays?
Thinking more in the context of file transfer than VOIP, but could be used for that as well

Q: might want to consider ipv4 and ipv6 interworkings in this work
Q: actually for Skype relay services, each stream (channel) is a different relay potentially.
Q: philosophical comment: this seems to be a classical example of optimizing the personal vs social good. The reason TCP has an inverse relationship between bandwidth and delay is because it consumes more resources on average (? I'm not sure this is exactly what he said). If you deployed a system that does multi-hop routing with a linear type of forward, it's not clear that it serves a social good even though it provides a private benefit.
A: yes, that's true. Another question is how do you prevent this?
Q: the other thing I'm worried about is using overlay networks for... many of the benefits of these kinds of systems are essentially the "policy bypassing" of the philosophy of the internet.
A: yes, also important.
Q: I got the impression from the transport area that they're moving away from maximum windows on TCP. Wondering if application to relays of that problem is an artifact of the fact that TCP apps haven't caught up to bandwidth yet? And if so is this really just an optimization problem?
A: not following closely what is happening in the TCP world to enhance performance, but imagine there will always be situations where, among competing TCP flows, the amount of bandwidth each gets is proportional to 1/round-trip time.
Q: You focused here on performance. Should think about scalability and robustness of the network. The DTN group has been exploring the concept of custody transfer, as you might not have continuous connectivity. Encourage consideration of robustness issues as this work progresses.

Open discussion: Research questions and IETF?
P2p SIP work has already been mentioned. Any other suggestions?

Phil:
Q: DHTs are very good at doing 'lookup' (specific key) but not so good at 'search', i.e. looking for more wildcard type keys. In p2p SIP systems, some kind of search capability is important. Not familiar with much work on this in the literature.
A: good point. As far as what's been done, there is the survey draft that was mentioned earlier with many references to this. Also some work into incorporating semantic data into p2p systems.
Q: we've been building a system which combines the CAN DHT and a web ontology system (hybrid system). In terms of a research issue, there are many security considerations. Need to think about using different metrics for p2p systems. Not necessarily deterministic security, but perhaps we should look at probabilistic security. How likely is it that a certain type of attack will happen?
Q: for very large systems... sometimes just by design of the system you get false hits. How do you fixed hash systems... multi dimensional room searches of namespaces... any research going on in this area?
A: there is some work into unstructured p2p networks, which basically do some very sophisticated querying. But there are scaling issues. Some people want to see semantic web type behaviour over the network, and they might also be looking at this.

Dave Thaler (?):
Q: suggestion for a research topic: management of p2p systems.
Using p2p systems TO manage things has been discussed, but not so much management of those networks themselves. What are the metrics for measuring this?
A: haven't seen systematic discussion of that. JXTA has some capability to do status monitoring of its nodes.

Phil: the effect of running algorithms like DHTs in the internet environment. Non-transitivity in the internet. Also NATs: the assumption is usually that everyone has public IP addresses, which is not always the case these days.
Q: this brings up a tough design decision: persistent nodes that have little or flaky joins and leaves (?)... could be a good research area.

Great... next step is to take this to the mailing list and discuss this further.
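
Editor's illustration of the 'lookup' vs 'search' point raised in the open discussion. This is a minimal sketch, not taken from any system presented at the meeting: the class ToyDHT, its methods, and the example SIP addresses are all hypothetical, the "nodes" are in-memory dictionaries, and overlay routing hops are not modeled. It only shows why hashing makes an exact-key lookup cheap while a wildcard query has to ask every node.

```python
import fnmatch
import hashlib


class ToyDHT:
    """Hypothetical toy stand-in for a DHT: each key is owned by one node."""

    def __init__(self, num_nodes):
        # One dict per node; a real DHT would spread these across hosts.
        self.nodes = [dict() for _ in range(num_nodes)]

    def _owner(self, key):
        # Hash the exact key to pick the responsible node.
        digest = hashlib.sha1(key.encode("utf-8")).digest()
        return int.from_bytes(digest, "big") % len(self.nodes)

    def put(self, key, value):
        self.nodes[self._owner(key)][key] = value

    def lookup(self, key):
        """Exact-key lookup: only the owning node is queried."""
        nodes_queried = 1
        return self.nodes[self._owner(key)].get(key), nodes_queried

    def search(self, pattern):
        """Wildcard search: hashing scatters similar keys, so all nodes are queried."""
        hits = {}
        for store in self.nodes:
            for key, value in store.items():
                if fnmatch.fnmatchcase(key, pattern):
                    hits[key] = value
        return hits, len(self.nodes)


dht = ToyDHT(num_nodes=16)
dht.put("sip:alice@example.com", "192.0.2.10")
dht.put("sip:alicia@example.com", "192.0.2.11")
dht.put("sip:bob@example.com", "192.0.2.12")

value, queried = dht.lookup("sip:alice@example.com")
print(value, queried)

hits, queried = dht.search("sip:ali*")
print(sorted(hits), queried)
```

In this sketch the lookup queries 1 node while the search queries all 16, whatever the number of matches; the approaches mentioned in the discussion (semantic data, ontology hybrids) aim to avoid that full scan.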