Today I noticed an interesting Jisc-funded project at Liverpool that I hadn’t previously heard about (Jisc blogged about it today). It describes a method of sharing resources amongst students using a crowdsourcing approach. The service is called Kritikos and takes several quite interesting approaches. At the heart of the system is some work that has been done with students to identify resources relevant to their subjects (in this case Engineering) and also to identify results that weren’t relevant (often because some engineering terms, such as stress, have different meanings elsewhere). That’s an interesting approach, as one of the criticisms I’ve heard about discovery systems is that they struggle to distinguish between terms that are used across different disciplines (differentiation, for example, has separate meanings in mathematics and biology).
The search system uses a Google Custom Search Engine but then presents the results as images, which is a fascinating way of approaching this aspect. Kritikos also makes use of the Learning Registry to store data about students’ interactions with the resources and whether they found them relevant or not. It seems to be a really novel approach to providing a search system, and it could go some way to addressing one of the common comments that we’ve been seeing in some work we’ve been doing with students: they feel that they are being deluged with too much material and struggle to find the gold nuggets that give them everything they want.
Kritikos looks to be particularly useful for students in the later stages of their degrees, where they are more likely to be doing some research or independent study. One of the things that we are finding from our work is that students at earlier stages are less interested in what other students are doing or what they might recommend. But possibly if they were presented with something like Kritikos they might be more inclined to see the value of other students’ recommendations.
The rise of the datastore
The last year or so has seen the growth of a new type of resource available on the web, the “datastore”. Datastores are collections of data, generally but not necessarily government data, and usually authoritative. Examples include DataSF from San Francisco, Chicago City Data, the UK Government datastore data.gov.uk, the London datastore and the Guardian newspaper’s datastore.
The defining aspect of datastores is that they provide ‘raw data’ collected together in one place, rather than being spread across many different government and other websites. That raw data can cover a wide variety of subjects, from mortality statistics and Indices of Deprivation, through to ‘how many miles of high-speed railway’ and FTSE100 Directors’ pay. Generally the data is presented in the form of tables, often in Excel or CSV format (or exportable in those formats).
Alongside the benefits of having the data collected together in one place, and in many cases having data that has never been made publicly available before, datastores offer the potential to start to present and analyse the data through visualisations and data mashups, which combine data from more than one source and expose connections. There are a few examples on Tony Hirst’s blog, and a visualisation of London population built with Many Eyes is another.
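As a rough illustration of what a mashup involves, here is a minimal Python sketch that joins two CSV tables on a shared column and derives a new figure from the combination. The two tables, their column names and all the numbers are made up for the example; they are not taken from any real datastore.

```python
import csv
import io

# Hypothetical extracts standing in for two datastore downloads.
# The figures are invented purely for illustration.
POPULATION_CSV = """borough,population
Camden,230000
Hackney,250000
Islington,210000
"""

LIBRARIES_CSV = """borough,libraries
Camden,9
Hackney,8
Islington,10
"""


def read_table(text, key):
    """Parse CSV text into a dict of rows indexed by the given column."""
    return {row[key]: row for row in csv.DictReader(io.StringIO(text))}


def mash_up(population_text, libraries_text):
    """Join the two tables on borough and derive residents per library."""
    population = read_table(population_text, "borough")
    libraries = read_table(libraries_text, "borough")
    combined = []
    # Only boroughs present in both sources can be combined.
    for borough in sorted(population.keys() & libraries.keys()):
        people = int(population[borough]["population"])
        branches = int(libraries[borough]["libraries"])
        combined.append({
            "borough": borough,
            "population": people,
            "libraries": branches,
            "residents_per_library": round(people / branches),
        })
    return combined


for row in mash_up(POPULATION_CSV, LIBRARIES_CSV):
    print(row["borough"], row["residents_per_library"])
```

The useful part is the join on a common key: once two datasets share a column such as borough name, a connection that neither table shows on its own becomes visible.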
Challenges for libraries and librarians
Datastores help with discovery by collecting relevant data together in one place, such as on data.gov.uk, but they do still present some specific challenges for libraries and librarians. There are still some discovery challenges, but I would consider that the biggest challenges are around librarians getting to grips with exploiting the data within the datastores.
The discovery challenges are about finding the datastores and understanding what is contained within each one. But these are skills that librarians are used to using to find and assess resources, so they shouldn’t present much of a challenge. Techniques such as building Google Custom Search Engines to search datastores can help with finding relevant data within these resources.
Building a custom search engine to search the London, UK Government and Guardian datastores is fairly straightforward, so I’ve built a quick example at http://www.google.co.uk/cse/home?cx=009989586971183011327:nt82dyi0ehc
Using this form of search engine makes it simple to discover which datastores have datasets that may be of interest.
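For anyone who wants to query a custom search engine from a script rather than through the web page, the sketch below builds a request URL for Google’s Custom Search JSON API. It assumes the engine ID from the example URL above is still valid and that you have your own API key; the function names and the YOUR_API_KEY placeholder are mine, and no request is actually sent.

```python
from urllib.parse import urlencode

# Endpoint of Google's Custom Search JSON API.
API_ENDPOINT = "https://www.googleapis.com/customsearch/v1"

# Engine ID copied from the example search engine URL in the post.
# Assumption: the engine still exists and is enabled for API use.
ENGINE_ID = "009989586971183011327:nt82dyi0ehc"


def build_search_url(query, api_key, engine_id=ENGINE_ID, results=10):
    """Return the request URL for one page of custom search results."""
    params = {"key": api_key, "cx": engine_id, "q": query, "num": results}
    return API_ENDPOINT + "?" + urlencode(params)


def titles_and_links(response):
    """Extract (title, link) pairs from a decoded JSON response."""
    return [(item["title"], item["link"]) for item in response.get("items", [])]


# YOUR_API_KEY is a placeholder, not a working key.
print(build_search_url("indices of deprivation", api_key="YOUR_API_KEY"))
```

Fetching that URL and decoding the JSON gives a dictionary that titles_and_links can reduce to a simple list of which datastore pages matched.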
Where I think it starts to become more difficult for libraries is in exploiting the data in the datastores. There is a question here about the role of the librarian. Is the role just to find the data, check its quality and promote it to academics and students, or is there a role to help users to find ways of using the data? The latter role implies a much deeper understanding of how the data can be used: not just being able to export the data to a spreadsheet and produce a nice visualisation, but also knowing how to use APIs to dig into datastores and how to use tools such as Yahoo Pipes to take data and transform it. The question is how far librarians and libraries see that as their role, and how far they see their role as being that of supporting students and academics in exploiting the data, by learning and teaching the techniques needed to understand and exploit it.
Obviously some librarians are more comfortable playing around with data than others, but the interest among librarians in the Mashed Library events indicates that a growing number of them are starting to appreciate that this is an area relevant to libraries. And over the years libraries and librarians have had to get to grips with several generations of technological innovation, from CD-ROMs through the World Wide Web to RFID, and in each case librarians have taken on board new skills to exploit the new technologies and help their users.
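To give a flavour of the “take data and transform it” step, here is a small Python sketch of a Pipes-style chain: filter the rows, rename a field, then sort. It is not how Yahoo Pipes itself works internally, just the same idea expressed as three small functions. The rows and field names are hypothetical.

```python
def keep_where(rows, field, predicate):
    """Pass through only the rows whose field satisfies the predicate."""
    return (row for row in rows if predicate(row[field]))


def rename(rows, old, new):
    """Rename one field in every row, leaving the other fields untouched."""
    for row in rows:
        yield {(new if name == old else name): value for name, value in row.items()}


def sort_by(rows, field, descending=False):
    """Return the rows as a list ordered by the given field."""
    return sorted(rows, key=lambda row: row[field], reverse=descending)


# Hypothetical rows standing in for a table exported from a datastore.
RAIL_ROWS = [
    {"country": "Country A", "miles": 0},
    {"country": "Country B", "miles": 1200},
    {"country": "Country C", "miles": 350},
]


def run_pipeline(rows):
    """Chain the three steps: filter, rename, sort."""
    with_rail = keep_where(rows, "miles", lambda miles: miles > 0)
    renamed = rename(with_rail, "miles", "high_speed_rail_miles")
    return sort_by(renamed, "high_speed_rail_miles", descending=True)


for row in run_pipeline(RAIL_ROWS):
    print(row["country"], row["high_speed_rail_miles"])
```

Each step takes rows in and hands rows on, so steps can be added, removed or reordered without touching the others, which is much of what makes this style approachable for people who don’t think of themselves as programmers.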