Data Set Identifier Interoperability BoF, IETF 84

Chairs: Beth Plale, Ted Hardie
Agenda: http://www.ietf.org/proceedings/84/agenda/agenda-84-dsii
Presentations: https://datatracker.ietf.org/meeting/84/materials.html#DSII
Recordings: http://www.ietf.org/audio/ietf84/ietf84-regencyc-20120731-1520-pm2.mp3 , http://connect.iu.edu/p2hoj2awzs5/
Note taker: Robert Ping

Ted Hardie welcomed the group and went through the Note Well statement for the IETF, reiterating that this BoF was not intended to form a working group.

Beth Plale then reviewed the conceptual framework for work in this area, discussing both the framework for scientific data sets and the key role played by associating metadata with the generated data. Core issues for data sets and their identifiers are: discovery, data access, access control, logical arrangement, governance, distribution models, costs, relationship interoperability and service interoperability. The issues raised are particularly problematic for long-tail data, where the funds and effort available to curate the data may be low.

The group then reviewed several current Data Set ID Systems (please see slides):

  EZID - Janee
  Handle System - Lannom
  EPIC - Wittenberg - CLARIN - EUDAT
  NI URI scheme - Farrell

Discussion of current systems - Plale

Question of data and data sharing in the Earth Sciences: adoption of DOIs is common, but others are creating their own identifiers or winging it. Being able to do discovery on top of this will take some agreement. The key question is: are we at a pain point where we can get some agreement? Should we get agreement on information types and use that to create a larger platform? Do we need something like the IETF to get this going? How do we collaborate?

Also note that there are commercial uses: creating a collection to stream from a cache on the network, including audio, video, closed captioning and ads, each potentially with a different data identifier type. We don't want to do this manually.

Scott Bradner (added as comment) - This discussion should include localization: I want to get the copy that is correct for me. Harvard may have a local copy, and I need that copy rather than the IU copy. That’s a pretty powerful aspect of this.

Leif Johansson - Another point: what does the end game look like for success here? Do we pick a winner, or does it remain a little of this and a little of that? (Chairs replied that this is not “pick a winner”.)

John Levine - Email and malware use management is a potential use case; they keep large files of spam. (He also asked a question about DOIs resolving to a document.)

Melinda Shore - Are we talking about standardizing metadata or search interfaces? It is not clear what is being asked in this discussion, or how the IETF fits into it. What part of the squishy whole could have things done right now? One possibility is mappings between existing systems: not a definitive metadata set, but a way to map them together with some kind of registry. Is a registry of those things a minimal success story? What other discovery or indirection would then be possible?

Single comment - Need to get very specific about vocabulary on this issue.

Andy Maffei (Woods Hole Oceanographic Institution) - This is clearly identified as an interoperability problem, and the IETF has extensive experience, techniques and approaches for developing interoperability. The IETF can contribute a valuable perspective to work that has been going on for some time in the sciences. Topics for future discussion: recommending a limited number of PID schemes for science datasets; an agreed-upon API for assigning identifiers to scientific datasets (EZID as a model); defining core non-domain-specific metadata. He agrees that in the realm of domain-specific metadata there is a lot of very hard work that needs to be done (that is probably out of scope for the IETF). Establishing interoperability frameworks and core metadata for science dataset identification is suggested as in scope.

Comments from Nassib Nassar (UNC Chapel Hill) - Yes, the problem is tractable; but because the existing ID standards diverge in solving slightly different problems, it would be important at the outset to define and limit the scope of the use cases that will be addressed.

Interoperability considerations - Hardie

The group discussed interoperability mechanisms briefly.

The chairs concluded by thanking the group for the day's discussion and asked folks to continue the discussion on the list.