I had the opportunity to go and listen to Martin Weller (@mweller on twitter) and Nick Pearce (@drnickpearce) talking about their work on Digital Scholarship this morning. I’d put together some thoughts last year on an earlier blog post – Digital scholarship and the challenges for libraries – so it was good to get an update on how the work is moving forward.
Digital Scholarship context
Nick Pearce set the context for Digital Scholarship with a short presentation – available on slideshare here. Looking at technology first he set out the view that books and language could be viewed as ‘technologies’. Books as a technology wasn’t too contentious for a room full of library people. Language as a technology is a bit more of a stretch but if you view it as a tool to enable change in a community then it’s a good analogy. His comment that ‘old technologies often persist – for good reasons’ was particularly interesting and the classic example is radio continuing alongside TV. But I’d wonder if these two technologies are fulfilling exactly the same role or whether they have established different roles for themselves.
Turning to digital technologies he pointed to the large number and wide variety of services, using Ludwig Gatzke’s image of the incredible range of web 2.0 services as an illustration of how this year’s favourite technology is next year’s history. Many of the services shown in the image no longer exist, and the list doesn’t show services such as twitter that are currently very popular.
That points to a real risk that you choose to adopt a technology platform that turns out to be transient, or you find that ‘a year later everyone has moved on’.
Nick then looked at some of the key features of the digital environment and suggested that only a small number of users were actually creating content (which gives me pause for thought given the enormous growth that sites like YouTube are experiencing with user-generated content), and that you are relying on sites that are in perpetual evolution, effectively constantly in beta-testing.
Technologies, issues and challenges
Turning to scholarship and using Boyer’s “Scholarship reconsidered” model we started to briefly look at what technologies, issues and challenges might present themselves for the four elements of Boyer’s model.
Ideas that came up include the ever-increasing amount of data (data deluge), challenges in economics and funding, and issues around social networking. Nick went on to give some examples of Open Data (e.g. datacite.org), Open Publishing (the Open access movement), Open engagement through blogs and twitter feeds from people such as Richard Dawkins, and Open education (Open Learn and OERs).
“the Open Scholar is someone who makes their intellectual projects and processes digitally visible and who invites and encourages ongoing criticism of their work and secondary uses of any or all parts of it–at any stage of its development.” Academic Evolution
Digital Scholarship work
Martin Weller then took us through the work that is being carried out to investigate digital scholarship. This comprises three elements:
- promote digital scholarship
- work on recognition
- research current practice
It was interesting to hear of the work to create a new digital scholarship hub DISCO that is being launched shortly, and good to get a brief preview of it. Martin talked about his aim to formulate some kind of ‘magical metric to measure digital scholarship’ and it would be interesting to see how this sort of scoring system could be used – take the scorecard along to your appraisal with the results? Aims included trying to decide what represented good public engagement and working on case studies that academics could use as part of their promotion case.
Martin briefly covered some of the issues around digital scholarship including issues around rights, skills, plagiarism, time and quality/depth. We then spent a little time looking at issues, benefits and what we’d like to change. The sorts of things that our group talked about included: difficulties of getting people to engage; the lack of awareness of what the technology can do and concerns about quality in comparing peer-reviewed journals with blogs, for example. For the library we thought there was a fit around the library increasingly focusing on electronic rather than print resources but there are challenges around managing and curating access to material in social networking environments that may be ephemeral. The issue of persistent identifiers to this type of material is a real concern.
Finally, in an all too brief session, Martin flagged up the JISC Digital Scholarship ‘Advanced Technologies for Research’ event on 10 March 2010.
It was interesting that the presenters had slightly different perspectives on Digital Scholarship. It would have been good to have a bit more time to talk through some of the discussions and have more feedback, but time was a bit limited. It is fascinating to hear at first hand some of the work that is taking place to map out equivalencies between traditional academic practice and potentially new academic practices. It would be good to get some of the counter-arguments as to why some people don’t think that blogs and suchlike are equivalent to traditional practice.
For libraries the issues are especially around discovery and providing access to the material. A colleague made the point that librarians can’t evaluate the content in a blog as they don’t have the subject knowledge. At present evaluation of resources is as much down to evaluating the quality of the publishing medium, e.g. it’s in Nature or a reputable resource so it should be appropriate. With blogs librarians don’t have that context to use.
And the other big issue for libraries is persistence of links. A whole technology industry has grown up around these problems – SFX, OpenURLs, DOIs and so on – and work will be needed to think through the implications of content migrating from a few hundred aggregated collections of peer-reviewed academic journals to many thousands of individual resources in the cloud. But maybe this is where technologies such as Mendeley come in?
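The appeal of DOIs for link persistence is that the link is derived from a stable identifier rather than from wherever the content happens to live today. A minimal sketch, using an invented DOI (not a real article):

```python
# A DOI is a persistent identifier; the link to the content is built
# from the identifier via a resolver, so it survives even if the
# publisher reorganises its website.
# The DOI below is made up purely for illustration.

def doi_to_url(doi):
    """Turn a DOI string into a resolver URL via dx.doi.org."""
    return "http://dx.doi.org/" + doi

example_doi = "10.1000/xyz123"
print(doi_to_url(example_doi))
```

The resolver then redirects to the current location of the content, which is exactly the indirection that ordinary web links to blogs and cloud-hosted resources lack.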
The rise of the datastore
The last year or so has seen the growth of a new type of resource available on the web, the “datastore”. Datastores are collections of data – generally, though not necessarily, government data – and are usually authoritative. Examples include DataSF from San Francisco, Chicago City Data, the UK Government datastore data.gov.uk, the London datastore and the Guardian newspaper’s datastore.
The defining aspect of datastores is that they provide ‘raw data’ collected together in one place, rather than being spread across many different government and other websites. That raw data can cover a wide variety of subjects, from mortality statistics and Indices of Deprivation, through to ‘how many miles of high-speed railway’ and FTSE100 Directors’ pay. Generally the data is presented in the form of tables, often in Excel or CSV format (or exportable in those formats).
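Because the raw data usually arrives as a CSV table, getting started with it is straightforward in any scripting language. A quick sketch in Python – the column names and figures below are invented, standing in for the kind of mortality table a datastore might export:

```python
import csv
import io

# Hypothetical CSV in the style of a datastore download; the areas,
# populations and death counts are invented for illustration only.
raw = """area,population,deaths
Camden,235700,1650
Hackney,219200,1350
Islington,194100,1400
"""

rows = list(csv.DictReader(io.StringIO(raw)))

# Derive a crude death rate per 1,000 residents from the raw columns.
for row in rows:
    rate = 1000 * int(row["deaths"]) / int(row["population"])
    print(row["area"], round(rate, 2))
```

A real script would read the downloaded file with `open()` rather than an inline string, but the point is that once the data is ‘raw’ rather than locked in a report, this sort of derived figure is a few lines of work.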
Alongside the benefits of having the data collected together in one place, and in many cases having data that has never been made available publicly, datastores offer the potential to start to present and analyse the data through visualisations and data mashups – combining data from more than one source and exposing connections. There are a few examples on Tony Hirst’s blog, and an example below, using Many Eyes, of a visualization of London’s population.
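The ‘mashup’ idea – joining two sources on a shared key to expose a connection neither shows on its own – can be sketched very simply. Both datasets below are invented for illustration:

```python
# Two hypothetical datasets keyed by borough name; the figures are
# made up, standing in for, say, a population table from one
# datastore and a libraries table from another.
population = {"Camden": 235700, "Hackney": 219200}
libraries = {"Camden": 12, "Hackney": 8}

# Join on the shared key: residents per public library, a figure
# that exists in neither source dataset by itself.
combined = {
    borough: population[borough] / libraries[borough]
    for borough in population.keys() & libraries.keys()
}
print(combined)
```

Real mashups add the messy work of reconciling keys (borough names rarely match exactly across sources), but the join itself is this simple.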
Challenges for libraries and librarians
Although in terms of discovery datastores help by collecting relevant data together in one place, such as on data.gov.uk, they do still present some specific challenges for libraries and librarians. There are still some discovery challenges, but I would consider that the biggest challenges are around librarians getting to grips with exploiting the data within the datastores.
The challenges in this area are about finding the datastores and understanding what is contained within each one. But these are skills that librarians already use to find and assess resources, so they shouldn’t present much of a challenge. Techniques such as building Google Custom Search engines to search datastores can help with finding relevant data within these resources.
Building a custom search engine to search the London, UK Government and Guardian datastores is fairly straightforward, so I’ve built a quick example at http://www.google.co.uk/cse/home?cx=009989586971183011327:nt82dyi0ehc
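Once the custom search engine exists, you can link straight into a set of results by building the query URL yourself – handy for embedding searches in a library guide. A sketch, reusing the engine id from the link above (the query term is just an example):

```python
from urllib.parse import urlencode

# The cx value identifies the custom search engine; this is the id
# from the example engine linked above.
CSE_ID = "009989586971183011327:nt82dyi0ehc"

def cse_search_url(query):
    """Build a results URL for the custom search engine."""
    params = urlencode({"cx": CSE_ID, "q": query})
    return "http://www.google.com/cse?" + params

print(cse_search_url("indices of deprivation"))
```

Google also offers a JSON API for programmatic access to custom search results, but a plain results link like this is enough for most guide-building purposes.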
Using this form of search engine makes it simple to discover which datastores have datasets that may be of interest.
Where I think it starts to become more difficult for libraries is in exploiting the data in the datastores. There is a question here about the role of the librarian. Is the role just to find the data, check its quality and promote it to academics and students, or is there a role in helping users find ways of using the data? The latter implies a much deeper understanding of how the data can be used: not just being able to export the data into a spreadsheet and produce a nice visualization, but also knowing how to use APIs to dig into datastores, and how to use tools such as Yahoo Pipes to take data and transform it. The question is how far librarians and libraries see that as their role, and how far they see their role as supporting students and academics in exploiting the data, by learning and teaching the techniques needed to understand and exploit it.
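The ‘transform’ step that tools like Yahoo Pipes automate – filter, derive, reorder – can also be done in a few lines of script. A sketch on an invented dataset (a real version would fetch the CSV from the datastore’s API with urllib rather than embedding it):

```python
import csv
import io

# Hypothetical datastore export; the years and mileage figures are
# invented for illustration.
raw = """year,miles_of_high_speed_rail
2007,68
2008,68
2009,113
"""

rows = list(csv.DictReader(io.StringIO(raw)))

# Transform the raw series into something more useful to a reader:
# the year-on-year change, newest first.
changes = [
    (curr["year"],
     int(curr["miles_of_high_speed_rail"]) - int(prev["miles_of_high_speed_rail"]))
    for prev, curr in zip(rows, rows[1:])
]
changes.sort(reverse=True)
print(changes)
```

Nothing here is beyond a short training session, which is really the point: the gap between ‘finding the data’ and ‘exploiting the data’ is a skills gap, not a technology one.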
Obviously some librarians are more comfortable playing around with data than others, but the interest in the Mashed Library events indicates that a growing number of librarians are starting to appreciate that this is an area relevant to libraries. Over the years libraries and librarians have had to get to grips with several generations of technological innovation, from CD-ROMs, through the world wide web, to RFID, and in each case librarians have taken on new skills to exploit the new technologies and help their users.