Wednesday, 18 March 2009

My Faves for Tuesday, March 17, 2009

A succinct summary of the outcome of the UKRDS (UK Research Data Service feasibility study) meeting on 26 February on Neil Beagrie's blog, with links to the executive summary report and presentations from the international set of speakers.

The full event was also blogged extensively by Chris Rusbridge (see also continuation posts 2, 3, and 4).

Andy Powell's blog summarises his critical live twittering of the event and includes a number of comments by others.

[tags: website, report, service, data management]

See the rest of my Faves at Faves

Wednesday, 11 March 2009

My Faves for Tuesday, March 10, 2009

The Guardian have compiled a Data Store whose aim “is to make important data more accessible to people.”

It consists of a set of links to data and statistics pertaining to a range of contemporary subjects including: Migration, Education, Health (UK and beyond), Military, Politics, Unemployment, Finance.

Within each data page there’s a link to a Google spreadsheet where you can see, download and manipulate the data.

The accompanying Datablog provides an avenue through which such statistics and data can be discussed.

[tags: data sharing]

Friday, 6 March 2009

Research data into Fedora at Oxford

The JISC-funded DISC-UK DataShare project in Oxford has brought together several units within the collegiate University: the Oxford University Library Services, the Nuffield College Data Library, the Oxford University Computing Services and the Oxford e-Research Centre.

This post looks at some of the work carried out by my colleagues in the Library to explore ways to manage research data in Fedora. These efforts are recounted in the blog of Ben O'Steen, Oxford Research Archive Software Engineer.

Some months ago Ben provided an exceptional account of the challenges encountered when ingesting a research dataset into Fedora. He described how he dealt with modelling and storing a phonetics dataset given to him on a DVD-R, containing around 600 audio files organized in a hierarchical structure.

In a more recent post Ben returns to storing, curating and presenting research data. This time he focuses on tabular data and highlights three things: capturing the implicit information (column data types, links between tables), keeping the original dataset, and maintaining a version of the data in a well-understood format, with a machine-readable description of the tables.
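To make the idea of a machine-readable table description a little more concrete, here is a minimal sketch in Python. It is not Ben's code, and it does not reflect how the Oxford archive works: the function names, the JSON layout, the sample table and the `speakers.id` link are all invented for illustration. It only assumes that a table arrives as CSV text, guesses a type for each column, records any table interlinks you supply by hand, and stores a checksum so the description can be tied back to the untouched original.

```python
import csv
import hashlib
import io
import json


def infer_type(values):
    """Guess the narrowest type that fits every non-empty value in a column."""
    non_empty = [v for v in values if v.strip() != ""]
    if not non_empty:
        return "string"
    for name, cast in (("integer", int), ("number", float)):
        try:
            for v in non_empty:
                cast(v)
            return name
        except ValueError:
            continue
    return "string"


def describe_table(name, csv_text, foreign_keys=None):
    """Build a machine-readable description of one table held as CSV text."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, body = rows[0], rows[1:]
    columns = [
        {"name": col, "type": infer_type([r[i] for r in body if i < len(r)])}
        for i, col in enumerate(header)
    ]
    return {
        "table": name,
        # Checksum of the original bytes, so the description points at an
        # unmodified copy of what the researcher deposited.
        "sha256_of_original": hashlib.sha256(csv_text.encode("utf-8")).hexdigest(),
        "row_count": len(body),
        "columns": columns,
        # Table interlinks cannot be inferred from one file; they are
        # supplied explicitly (hypothetical example below).
        "foreign_keys": foreign_keys or [],
    }


# Hypothetical sample data, made up for this sketch.
sample = "speaker_id,age,pitch_hz,dialect\n1,34,182.5,north\n2,,201.0,south\n"
description = describe_table(
    "recordings",
    sample,
    foreign_keys=[{"column": "speaker_id", "references": "speakers.id"}],
)
print(json.dumps(description, indent=2))
```

The type guessing here is deliberately crude (empty cells are ignored, and anything that is not a number falls back to a string); a real service would want the researcher to confirm or correct the guesses, which is rather the point of capturing this information at deposit time.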

Ben's post also identifies a gap in institutional and departmental IT support for researchers who need to store tables of data. It suggests HBase as the kind of basic service that could be provided, both to avoid free-form tabular datasets and to educate researchers.

All this work has been taking place in parallel with the scoping study I have been conducting over the last 15 months to establish the requirements for services to manage and curate data. Like DataShare, this project finishes at the end of March, but there will certainly be more data management and curation activities in the University of Oxford.