Saturday, 28 August 2010

WissKi project for scientific collaboration and data sharing

As part of my CLAROS-related activity, I've been taking a poke around the WissKi project, which is a German-funded, Drupal-based collaboration platform for scientific research and data sharing, and which also uses CIDOC-CRM as a base ontology.

Generally, this looks like an interesting project and I wonder if we shouldn't be looking to establish links with other data management work in Oxford and beyond. I have been asked to attend a WissKi meeting in September, so it will be interesting to see what common themes we can find.

Among other things, they have assembled a couple of useful ontology-related lists:

Wednesday, 25 August 2010

ADMIRAL Sprint 10 review and Sprint 11 plan

A new (half-time) developer started on the ADMIRAL project yesterday. After the usual administrative details, and setting up a development environment, we did a mini sprint plan for the next 2 weeks of the ADMIRAL project. I say a mini sprint plan, as we didn't do the full activity/user story selection, task breakdown and scope bartering, but rather reviewed the remnants of the most recent active sprint plan and identified key unfinished tasks to be tackled.

The next goal for the project is to complete the functionality covered by phase 1 of the project plan by the end of October, with front-to-back submission of research datasets to the Library Services Databank repository service, and visible web-based feedback to our research partners about the submitted datasets. We intend to use this as the basis for iterative improvements and enhancements in phase 2 of the project, with the researchers guiding us on what constitutes useful metadata to capture and expose with the submitted datasets.

The sprint plan for the period to 7 September aims to:

  • review, debug and update documentation for the ADMIRAL system scripted creation procedure
  • create a new ADMIRAL file sharing deployment for the evolutionary development group
  • file store bug fixes (password over unencrypted HTTP channel; HTTPS access reporting server configuration error)
  • progress work on Shuffl RDF support

Review of sprint 10:

Plan for sprint 11:

We've also had a technical meeting with the Library Services developer of RDFDatabank (aka Databank), the data repository system:

Monday, 23 August 2010

Gridworks for data cleaning?

I've noticed a fair buzz recently from open government data people about Gridworks, and specifically this blog post from Jeni Tennison:
I'm reminded of some problems faced publishing the FlyWeb data, and also of some discussions with Alistair Miles about tooling for cleaning up Malariagen data.

Unsurprisingly, similar problems appear to be faced in publishing government data as open linked data, and the solution that is finding favour there is Gridworks. If it works for them, then I figure it should also work for some of the research data we are trying to deal with. I'm thinking this is something we should look to explore in later phases of the ADMIRAL project, under the broad heading of building more formal structures around raw data (WP6).