Miskatonic University Press

Access 2009, Thursday #2: Richard Akerman - Will We Command Our Data?


Richard Akerman blogs at Science Library Pad. Peter Zimmerman blogged this talk too.

My notes from his Access 2009 talk:

Talking about data. Storage has gotten incredibly good. Can store huge amounts of data in tiny spaces. Data floating around in the cloud doesn't seem real, but it is, and it takes a lot of energy and hardware to store it: electricity, air conditioning, etc. Carbon emissions.

Four sources of data: research data, government data, library data, personal data.

Research data: lots done about storing it, giving access to it.

Government data: has really opened up in the last year.

Library data: in catalogues, access logs, id.loc.gov, etc.

Personal data: easy to share your GPSed location, and all other personal data, online.

Everything about sharing data is going up: the value of it (more can be done with it by others), the ease of it, and the level of it.

OECD agreed: data from publicly-funded research should be released to the public. One reason this isn't controversial is that publishers aren't in, and never were in, the business of publishing data. So data is an easy way to get into open access.

Toronto statement on prepublication data sharing: http://www.nature.com/nature/journal/v461/n7261/full/461168a.html

Open up the data before any papers based on it come out. Say, I'm going to write about this, but go ahead and use this data however you want.

In libraries: Berkeley Accord (March 2008). Basic rights to access to data in library systems. All vendors but one signed on (Innovative Interfaces?). Though how well have they implemented it?

Personal data. WIRED cover feature "Living By Numbers" and personal data tracking (July 2009).

Why libraries? Advocates, exemplars, experts.

If lots of data is made available, how is it made findable? Need solid metadata and classification to make it easy for people to find, otherwise it's just a big mess of numbers.

http://datacite.org/ DOIs for data

NRC/CISTI: Gateway to Scientific Data Sets http://cisti-icist.nrc-cnrc.gc.ca/eng/services/cisti/scientific-data/data-sets/

Crown copyright in Canada makes it hard to give away government data (which is another example of how stupid it is), but this project is on: GeoGratis: http://geogratis.cgdi.gc.ca/

How can libraries connect to their patrons?

  • LibraryThing's free covers
  • Open Library
  • Talis Connected Commons
  • id.loc.gov

APIs vs raw data. APIs: always serve up the latest data, control over access, tracking/stats, complex functionality. Raw data: unconstrained access, not limited by the API, no metadata.
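A minimal sketch of that trade-off, not anything shown in the talk. The `DatasetAPI` class, the `raw_dump` function, the `demo-key` key, and the sample records are all made up for illustration. The API checks a key, counts requests, and always answers from the provider's current data, while the raw dump is a one-time copy you can do anything with but which goes stale.

```python
from dataclasses import dataclass


@dataclass
class DatasetAPI:
    """Hypothetical API sitting in front of a dataset the provider keeps updating."""
    records: list
    allowed_keys: set
    request_count: int = 0

    def query(self, api_key, min_year):
        # Control over access: unknown keys are refused.
        if api_key not in self.allowed_keys:
            raise PermissionError("unknown API key")
        # Tracking/stats: every successful request is counted.
        self.request_count += 1
        # Always answers from the provider's current records.
        return [r for r in self.records if r["year"] >= min_year]


def raw_dump(records):
    """Hypothetical raw download: a one-time copy with no limits and no extras."""
    return [dict(r) for r in records]


# Made-up sample data.
live = [
    {"title": "Survey A", "year": 2007},
    {"title": "Survey B", "year": 2009},
]
api = DatasetAPI(records=live, allowed_keys={"demo-key"})
snapshot = raw_dump(live)

# The provider publishes a new record after the dump was taken.
live.append({"title": "Survey C", "year": 2009})

api_titles = [r["title"] for r in api.query("demo-key", 2009)]
snapshot_titles = [r["title"] for r in snapshot if r["year"] >= 2009]

print(api_titles)       # the API sees the new record
print(snapshot_titles)  # the dump does not
```

With the dump you can filter, join, or reshape however you like, but you have to go get a fresh copy yourself. With the API you only get the queries the provider chose to offer.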

Book about recording of personal data: http://totalrecallbook.com/.