Wednesday, 19 November 2008

Consorcio Madrono - research data seminar

Stuart Macdonald and Luis Martinez were invited to speak at a research data seminar (http://www.consorciomadrono.es/noticias_eventos/evento11.html) organised by Consorcio Madrono, a consortium of 7 Madrid university libraries. The audience consisted primarily of library managers and directors but also included researchers, IT staff and e-research/e-science specialists.

With the aid of professional translators, Alicia Lopez Medina (UNED) gave an interesting overview of current initiatives in Spain, including those relating to research data management, e-research and repositories. She indicated that a concerted effort is required in Spain to address issues surrounding the 'Data Deluge' and data management in academic settings, as there are currently no platforms in place to cater for research data in an Institutional Repository environment.

Luis introduced the concept of research data with typologies and examples, and explained data management and data curation with their associated benefits and challenges. He finished with the findings of the study currently taking place at the University of Oxford, 'Scoping digital repository services for research data management'.

Stuart introduced the concept of data libraries (as they currently exist in UK tertiary education) and made mention of DISC-UK. He then discussed the DataShare project, showcasing deliverables and anticipated outcomes, and the Data Audit Framework. He finished by talking about Web 2.0 data visualisation tools that can be employed independently of, or potentially integrated into, a repository environment.

Celia Russell (ESDS International) then detailed why research data is worth preserving and provided an overview of institutional, national, European and global data infrastructures, with particular reference to e-science and national/international grid environments and initiatives. The day ended with an extended and lively panel discussion, which proved as fruitful to the panelists as it did to the enthusiastic audience.

What became evident during the discussions was the generic use of the word 'data' to describe quite different kinds of digital research output: highly curated collections, large-scale dataset gathering, lab-based data generation, secondary social-science data (researcher-created data), and research data products and summary data. These all have different patterns of generation and use, with variable lifetimes and life cycles. Distinguishing between finished research data packages and ongoing data production and analysis also needs further thought, and there are scale, storage and format issues to consider.

There is perhaps some scope to distinguish between pre-publication and post-publication data. The latter is more likely to be repository (and data librarian) friendly; the former is the domain of a new breed of 'data scientists' who are conversant with their subject but have less interest in metadata, discovery and preservation. It may well be time to discuss each of these patterns of generation and use individually, with a view to establishing where they differ and where they have features in common, and to articulating curatorial roles, responsibilities and relationships for each of them.

Plenty of food for thought!

Presentations from the seminar will be posted here soon.

Stuart Macdonald
DISC-UK DataShare
