JISC has organised a grouping of projects interested in data curation in and out of repositories, including DataShare and the Data Audit Framework. One of the key areas of shared concern is training.
Saturday, 26 April 2008
Friday, 25 April 2008
ORE will develop specifications that allow distributed repositories to exchange information about their constituent digital objects. These specifications will include approaches for representing digital objects and repository services that facilitate access and ingest of these representations. The specifications will enable a new generation of cross-repository services that leverage the intrinsic value of digital objects beyond the borders of hosting repositories. Software developers used OAI-ORE at the recent Open Repositories Conference to move digital objects to and from different repository software platforms as a proof of concept.
Tuesday, 15 April 2008
Jordan Hatcher's Open Data Commons Public Domain Dedication and Licence launched at the Open Knowledge Conference (OKCon) at the LSE (Mar. 2008). The ODC PDDL is a document intended to allow you to freely share, modify, and use this work for any purpose and without any restrictions. This licence is intended for use on databases or their contents (“data”), either together or individually.
Wednesday, 9 April 2008
A report-back on the recent Research Data Management Forum organised by DCC with RIN, with links to key presentations.
Wednesday, 2 April 2008
One project they support is Open Shakespeare, where public domain works borrowed from Project Gutenberg are annotated and otherwise enhanced. The ambition of the foundation can be seen in the attempt to build a Comprehensive Knowledge Archive Network in which they hope to catalogue all open projects and collections.
In the first session on Transport and the Environment, Gavin Starks from the climate-change-aware organisation AMEE (Avoiding Mass Extinction Engine) reported on the government agency DEFRA’s call last year for an open service provider for carbon footprint data. The result was that 107 developer API keys were requested in 6 months for the CO2 calculator service. We also heard from Tom Steinberg at MySociety that they hosted the "largest collection of broken pavement slab photos" on the Internet at www.fixmystreet.com.
Muki Haklay reminded us in a talk on Open Environmental Information that there is a long history of government regulation from the 1972 Stockholm Declaration through to the 2004 UK Environmental Regulations of the Freedom of Information Act. He demonstrated websites that evolved from the UK Friends of the Earth “What’s in your backyard?” campaign, emphasising that “Information needs to be linked with action,” and that open information is not enough: skills such as spatial literacy and technical literacy are needed to make sense of the information.
He urged those mashing up data to “take it seriously” and present the information in a useful way. He also asserted that Web 2.0 is overly focused on individuals rather than groups, and that recent technological development disempowers small organisations. As an example, he noted that the “open space license” made available from Ordnance Survey for free APIs is offered to individuals only.
During the next session we were given a scientific take on open source software (
Erik Duvall from ARIADNE (not the
Among his other observations on openness, he noted that there is a move from problems of scarcity to problems of abundance and scale, where findability is an issue, and attention becomes the scarce resource.
Lisa Petrides then discussed the OER Commons (based at iskme.org). The DIKA model they developed stands for Data --> Information --> Knowledge --> Action. An example is the Library of Congress historical image collection on Flickr, which is being annotated by the public to generate new knowledge. She called for a re-professionalisation of teachers as curriculum creators rather than merely delivery people. They take on exciting new roles as authors, re-mixers and online collaborators.
An interesting question from the discussion was “By encouraging the use of e.g. Slideshare in education, are we encouraging mass copyright infringement?” (Answer: this is not our problem.)
The conference then broke into two sessions. Yours truly got her nerve up to give a “lightning presentation” on the DataShare project when two speakers didn’t show. [These sessions were all video'd, so may show up on the OKF website at some point.]
Developments in DBpedia were reported on: turning Wikipedia content into semantic web data. This is easily seen in the structured content that now turns up in Wikipedia infoboxes, which can be queried (see HP entry for an example). A nice slide was shown of the semantic web layer cake. “Linked Data” uses HTTP URIs as names for things. For example, cities in Wikipedia are matched to their equivalent label in Geonames. One begins to imagine a much more powerful Wikipedia if the vision of DBpedia is realised.
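To make the "HTTP URIs as names for things" idea a bit more concrete, here is a rough sketch in plain Python (no RDF library). It is not taken from the talk: the choice of Edinburgh, the Geonames identifier (a deliberately fake placeholder) and the helper function names are all my own assumptions, used only to show how an equivalence link between two datasets might be stored and followed.

```python
# Illustrative sketch only: the city and the Geonames id are assumptions,
# not examples shown in the talk.
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"

DBPEDIA_CITY = "http://dbpedia.org/resource/Edinburgh"
GEONAMES_CITY = "http://sws.geonames.org/0000000/"  # hypothetical placeholder id

# A graph is just a list of (subject, predicate, object) triples.
triples = [
    (DBPEDIA_CITY, RDFS_LABEL, "Edinburgh"),
    (GEONAMES_CITY, RDFS_LABEL, "Edinburgh"),
    (DBPEDIA_CITY, OWL_SAME_AS, GEONAMES_CITY),
]


def objects(subject, predicate, graph):
    """Return every object that matches the pattern (subject, predicate, ?)."""
    return [o for s, p, o in graph if s == subject and p == predicate]


def equivalents(uri, graph):
    """Follow sameAs links in either direction from one URI."""
    found = set()
    for s, p, o in graph:
        if p != OWL_SAME_AS:
            continue
        if s == uri:
            found.add(o)
        elif o == uri:
            found.add(s)
    return found


if __name__ == "__main__":
    print(objects(DBPEDIA_CITY, RDFS_LABEL, triples))
    print(equivalents(DBPEDIA_CITY, triples))
```

Because both names are ordinary web addresses, anyone holding one of them can in principle look it up and discover the other, which is where the "much more powerful" part comes from.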
A lesson from Delicious was offered: the myth is that its success was because people wanted to share bookmarks. This is false: it was simply the best way of organising one’s own bookmarks.
Dave Puplett from LSE gave an overview of the problems of versioning in repositories and introduced the framework that the VIF project produced, with recommendations for correctly identifying a version of a work. (Tantalisingly, he didn’t tell us what they were, so you’ll have to read the framework to find out.) However, he did explain why date alone was not a reliable method.
Mark Birbeck gave a preview of an upcoming standard from W3C, RDFa, which emerged from XHTML 2. RDFa will unlock the metadata in web pages and encourage people to add more, by building on features already in HTML. His analogy was how blogs made producing HTML web pages easy; the trick is making metadata easy to publish within web pages and therefore indexable by search engines. Additionally, objects such as JPEGs embedded within a web page can be identified separately. Yahoo! is already indexing RDFa and soon will be indexing microformats.
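For a feel of what "metadata within web pages" means in practice, the sketch below pulls statements out of a small HTML fragment using only Python's standard library. It is a toy, not a conforming RDFa processor: it understands only the `about`, `property` and `content` attributes, and the page fragment, the example.org addresses and the class name `RDFaSketch` are all invented for illustration.

```python
from html.parser import HTMLParser

# Hypothetical page fragment: the URLs and values are made up for this sketch.
PAGE = """
<div about="http://example.org/report">
  <h1 property="dc:title">Conference report</h1>
  <span property="dc:creator">A. Blogger</span>
  <img about="http://example.org/photo.jpg" property="dc:title"
       content="Venue photo" src="photo.jpg">
</div>
"""


class RDFaSketch(HTMLParser):
    """Collect (subject, property, value) statements from a tiny RDFa subset."""

    VOID = {"img", "meta", "link", "br", "hr", "input"}  # tags with no end tag

    def __init__(self):
        super().__init__()
        self.stack = [None]   # subject in scope for each open element
        self.pending = None   # (subject, property) waiting for element text
        self.triples = []

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        # An "about" attribute names a new subject; otherwise inherit it.
        subject = a.get("about", self.stack[-1])
        prop = a.get("property")
        if prop and "content" in a:
            self.triples.append((subject, prop, a["content"]))
        elif prop:
            self.pending = (subject, prop)
        if tag not in self.VOID:
            self.stack.append(subject)

    def handle_data(self, data):
        # The element's text becomes the value of a pending property.
        if self.pending and data.strip():
            self.triples.append((*self.pending, data.strip()))
            self.pending = None

    def handle_endtag(self, tag):
        self.pending = None
        if tag not in self.VOID and len(self.stack) > 1:
            self.stack.pop()


parser = RDFaSketch()
parser.feed(PAGE)
for triple in parser.triples:
    print(triple)
```

Note how the embedded image gets its own `about` address, so its title is recorded against the photo rather than against the page that contains it.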
One highlight from the other parallel session that I missed was the launch of the Open Data License by Jordan Hatcher. This should help those who want to publish data openly on websites and in repositories by providing a Creative Commons-type license specifically designed for data.
Robin Rice