JISC Activity Data programme and Learning Analytics
A couple of things this week about the activity data projects that JISC funded last year as part of their Information Environment programme. I noticed that Huddersfield are going to be doing some more work on LIDP (the Library Impact Data Project) over the next few months. This phase two includes work on more data sources and a possible data shared service. The screenshot on the left lists the work they are planning to do. More details on their blog. It will be interesting to see how this goes.
On Tuesday this week we did a short lunchtime session for library and other OU staff on the work we did last year on the RISE activity data project. I did a short presentation on what we did in the project, and Liz (@LizMallett) covered the user evaluation and feedback. We also had a presentation by Will Woods (@willwoods) from IET on the University's work around Learning Analytics. Learning Analytics has now become an important project for the university and it will be interesting to see how it moves forward in the next few months. There is a short blog post on the event on the RISE blog that includes embedded links to the presentations.
Moving forward with Activity Data
Since RISE finished we’ve been looking at ways of embedding some of the recommendation ideas into our mainstream services. We’ve still been routinely adding EZProxy data into the RISE database. At the moment we are moving the RISE prototype search interface and the Google gadget across to a new web server as we are closing down the old library website. That should keep the search prototype running for a bit more time. It’s also a chance to tweak the code and sort out any bits that have degraded.
Our website developer (@beersoft) has been building some new features based on the ideas around using activity data. The live library website already displays dynamic lists of resources at a title level in the library resources section on the website http://www8.open.ac.uk/library/library-resources.
One of the prototypes takes the standard resource lists (which are at a title level) and shows the most recently viewed articles from those journals, using the data from the RISE database. The screenshot shows one of the current prototypes. So users would not only see the relevant journal title (with a link at the title level), but would also see the most recently viewed articles from that journal. For users that are logged in it would also be feasible to show the articles viewed by people on their course, or even their own recently viewed articles.
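As a rough sketch of how that journal-to-articles lookup might work against the RISE usage data (the table name, columns and sample rows here are all invented for illustration, not our actual schema):

```python
import sqlite3

# Hypothetical schema: one row per article view parsed from the EZProxy logs.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE rise_usage (
    journal_title TEXT, article_title TEXT, viewed_at TEXT)""")
conn.executemany(
    "INSERT INTO rise_usage VALUES (?, ?, ?)",
    [("Journal of Documentation", "Article A", "2012-01-10"),
     ("Journal of Documentation", "Article B", "2012-01-14"),
     ("Journal of Documentation", "Article C", "2012-01-12")])

def recent_articles(journal, limit=2):
    """Most recently viewed articles for a journal on the resource list."""
    rows = conn.execute(
        "SELECT article_title FROM rise_usage WHERE journal_title = ? "
        "ORDER BY viewed_at DESC LIMIT ?", (journal, limit))
    return [r[0] for r in rows]

print(recent_articles("Journal of Documentation"))  # → ['Article B', 'Article C']
```

The resource list page would then render each journal title link followed by the handful of article titles this returns.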
We’ve been starting to think about how best to present these new ideas on the website as we want to gauge user reactions to them Thinking at the moment is that we want to keep them separate from the ‘production’ spec services, so would have them in a separate ‘innovation’ or ‘beta’ space. I quite like the Cornell CULLABS or the Harvard Library Innovation Lab as a model to follow.
I’ve spent a lot of the past couple of weeks looking at things around library search, partly for the RISE project (where we are looking at developing a service to provide recommendations to users searching our Ebsco Discovery Solution), and partly because I’ve been thinking about how we should be presenting search on our new website.
So we’ve been looking at examples of approaches from elsewhere, looking at what users have been saying about library search, looking at the data of how people are actually using the search tools on our current website and reading a few reports and documents. This week there have also been some interesting blog posts about the area of library search and discovery solutions, not least Aaron Tay’s recent post looking at putting LibGuides into Discovery solutions, One Search Box to search them All and Jane Burke’s series of posts about library search on InfoViews which point to areas where library search doesn’t work.
Approaches to search presentation
Looking at the approach that libraries are taking to the way they present search on their websites, there seems to be a lot of common ground. Many have adopted a tabbed approach, where different types of search are nested (in our case a collections search for articles, a website search and a catalogue search). Examples of this approach include NCSU, Michigan and Auckland. Others, such as Huddersfield, use radio buttons to do something similar. Even the University of Houston Downtown, which offers a single LibSearch box to search everything, offers a tabbed search on their home page; the single LibSearch search goes to their version of Summon.
In some ways it's almost possible to see the slow steps towards a single 'Google-like' search for libraries as they start to get to grips with the potential of the new generation of search tools and build up confidence that the tool is the right one for users.
So how do users feel about library search?
In last year's library survey we had quite a lot of feedback about how students found searching for library resources. Looking at the words used to describe search in the open comments, 'difficult' featured prominently. At the time we were using federated search, which, although an improvement on what went before, was hardly an engaging user experience. Since then we've implemented Ebsco Discovery Solution and are doing some work now to evaluate how users are finding it. But our presentation through the website is still a tabbed search approach that effectively says to the user 'describe what sort of stuff you want using our language and then pick which one of these boxes might have the answer'. That seems a bit like that TV programme where you pick from a series of red boxes and have to guess what's inside!
What are users doing?
So we've been looking at what searches people are putting into the search boxes to see if we can understand more about search behaviour. We've looked at both our discovery search and the older federated search, and our catalogue and website searches. Looking at the top 20 search terms for each type of search, we find that about 40% of them are identical across all the search boxes. That goes up to over 50% if you look at 3 out of 4 search boxes. The table shows the common search terms with a coloured highlight and the 3 out of 4 terms with a grey background.
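The overlap can be worked out as a simple set calculation. The terms below are invented placeholders rather than our real search logs, and the lists are trimmed to four entries instead of twenty:

```python
# Toy top search terms per search box (real lists would have 20 entries each).
top_terms = {
    "discovery": {"harvard referencing", "athens", "ebooks", "endnote"},
    "federated": {"harvard referencing", "athens", "ebooks", "journals"},
    "catalogue": {"harvard referencing", "athens", "ebooks", "shakespeare"},
    "website":   {"harvard referencing", "athens", "dissertations", "maps"},
}

# Terms common to every search box.
all_boxes = set.intersection(*top_terms.values())

# Terms appearing in at least 3 of the 4 boxes.
in_three = {t for t in set.union(*top_terms.values())
            if sum(t in terms for terms in top_terms.values()) >= 3}

print(sorted(all_boxes))  # → ['athens', 'harvard referencing']
print(sorted(in_three))   # → ['athens', 'ebooks', 'harvard referencing']
```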
The search terms being applied suggest that users don't understand which search box to use, so they put the same searches into all of them. That may be because users don't understand the way we label the different search boxes or the terminology we use. If that's the case then maybe a single search box is the way to go.
But does that then open up a presentation issue about how you show the results? Which ones come from the website, from library resources or from the catalogue? But does it actually matter to a user? Libraries organise content into collections for a variety of reasons and I'm not convinced that users always need to know how we organise stuff. We seem to have a whole load of things in between 'Have you got this?' and 'Here it is' that aren't of interest to most users: 'What type of stuff is it?', 'Which collection is it in?', 'If it's this type of stuff look here, or that type look there'. Is that a carry-over from the concept of a 'reference interview', where we go through an iterative process to connect the user with the content? If so, I wonder whether that approach is appropriate in the self-service, instant, web world.
What do you want a user to do as a result of your recommendation?
If you are offering recommendations to users then you may have some specific outcomes that you want to achieve. On Amazon the recommendations ‘people who bought this also bought that’ would firmly seem to be aiming to increase sales. I’d wonder with Amazon whether it also broadens or narrows the range of titles that are sold. Does it encourage customers to buy items they wouldn’t normally have considered? I’m sure that is true, but is it reinforcing ‘bestseller’ lists by encouraging customers to buy the same items other people have bought or is it encouraging them to buy items from backlists. Is it exploiting the ‘long-tail’ of books that are available?
There's evidence from Huddersfield, reported in Dave Pattern's blog, that adding recommendations to the catalogue increased the number of titles being borrowed. His slides have an interesting chart (reproduced on the left) showing how the range of titles borrowed increased. So that is clearly impacting on the long-tail of stock within a library. The SALT Project with John Rylands and MIMAS is specifically looking at how recommendations might encourage humanities researchers to exploit underused materials in catalogues. SALT, like RISE, is being funded as part of the JISC Activity Data strand.
With the RISE project we are working with a narrow set of data in that the recommendations database will only contain entries for articles that have been accessed already. So there is less direct opportunity to exploit the long-tail of articles by showing them as recommendations. But our interface will be using the discovery service search so users see EDS results directly from that service alongside our recommendations, so there will be some potential broadening of the recommendations in the database.
One other aspect of recommendations that has come up is the extent to which they may be time-dependent for HE libraries. I was talking through some RISE ideas with Tony Hirst (his blog is at ouseful.info) the other week, and he challenged us to think about when recommendations will be useful to a student.
We build and run our courses in a linear fashion, so students go step by step through their studies doing assignment x on subject y and looking at resources z. Then they move on to the next piece of coursework. So with recommendations reflecting what happened in the past, there's a danger that the articles students on my course have been looking at all relate to last week's assignment and not this week's.
So that introduces a time element. A student may be interested in what students looked at the last time the assignment was set, which may have been a year ago (for the Open University where some modules run twice a year and some run yearly it might even be a different time period from course to course). So that implies that you might want to introduce a time element into your recommendation algorithm. This would need to check the current date and relate it to the course start date, then use that data and relate it to the last time that the course was run. We discussed that you would need to factor in a window either side to cope with the spread of time that students might be working on an assignment. At the moment for us it’s a moot point as our data only goes back to the end of 2010 so we can’t make those sorts of recommendations anyway. But it’s certainly something that needs to be considered.
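A rough sketch of what that date arithmetic might look like (the function and parameter names are hypothetical, and the ±2 week pad is just an illustrative value):

```python
from datetime import date, timedelta

def recommendation_window(today, this_start, previous_start, pad_weeks=2):
    """Map 'now' in the current presentation onto the equivalent period
    in the previous presentation, padded either side to cover the spread
    of time students spend working on an assignment."""
    weeks_in = (today - this_start).days // 7          # how far into the course we are
    centre = previous_start + timedelta(weeks=weeks_in)  # same point in the last run
    pad = timedelta(weeks=pad_weeks)
    return centre - pad, centre + pad

# A module that started 1 Feb 2011 and previously ran from 1 Feb 2010:
start, end = recommendation_window(date(2011, 3, 15),
                                   date(2011, 2, 1),
                                   date(2010, 2, 1))
print(start, end)  # → 2010-03-01 2010-03-29
```

Activity data from the previous presentation would then only feed recommendations when it falls inside that window. For modules that run twice a year, `previous_start` would differ per presentation, which is exactly the course-to-course variation mentioned above.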
(the blog post title owes a lot to Alan Sillitoe’s story and film of the same name ‘the Loneliness of the long-distance runner’)
A good thing about projects is the way they lead you to question assumptions about stuff. And RISE is doing that in a number of ways, not least my assumptions about what recommendations are. In our terms, recommendation services provide information on things that are likely to be of interest to the user, but the term 'recommendations' is already used elsewhere in libraries with a subtly different meaning. We use the term 'recommendation' where we might talk about recommended reading, or articles recommended by your course tutor. And you start to get the sense that there is some form of value hierarchy at work here.
When you start to look at making recommendations based on behaviour (I suppose you could call them 'derived recommendations' because they exist as a by-product of search behaviour), then you start to realise that there may be a difference between recommended reading, meaning something that the course suggests you read, and derived recommendations, in terms of which might be of more value to the user (or maybe which has a higher 'perceived' value to users). Now that implies to me that there is some form of hierarchy of value in recommendations. Thinking about it, the value hierarchy might go something like this:
1. Recommended readings – listed in your course materials – not things you have to read for your course, but things you should read;
2. Recommended by the librarian – things the librarian says will be useful;
3. Recommended by your peers – things that they've read and found useful – and that might be peers on your course, or on your social network;
4. Recommended by other means – so books that seem to be similar (at the same class number on the shelf), for example.
So where do ‘derived recommendations’ fit into that hierarchy model? Well with RISE we’re looking at several different types. Recommendations based on what ‘people on your course are looking at’ (Type A), those based on other things that people looked at in the same search sessions (Type B) and recommendations based on having a similar subject to an article you’ve looked at (Type C). So Type A seems to map quite closely to 3, Type B may also map to 3 but possibly slightly lower, and Type C would seem to equate to 4.
What will be interesting in the testing will be to see how these map out in reality. What's interesting to me is that, when asked at one of our search focus groups about the value of recommendations, the response was that yes, they would be useful. Recommendations from people on your course were useful, but what participants would really find useful was knowing what resources were being used by people who got high marks. That's a really interesting comment and pretty challenging to tackle in a sensible way. But it also implies that recommendations from people who previously studied a course are particularly valuable, which brings into play a timescale issue around recommendations that I need to think about some more.
Wednesday saw an early start to travel to the startup meeting for the JISC Activity Data programme, so two of the RISE team were in Birmingham. It was a good opportunity to talk to the other projects in the strand and find out what they are doing. Some of the projects we were reasonably familiar with such as the UCIAD project in KMi at the OU and the Library Impact Data project being run by Huddersfield, but others we didn’t know too much about, such as the SALT project at John Rylands and Mimas and the AEIOU project in Wales.
Rather than just getting each project to outline the work they were doing, the Programme meeting took a different approach: we each did a short presentation about the hypothesis that is at the root of our project in the Activity Data strand. In our case our hypothesis is that
“Recommendations Improve the Search experience in new generation discovery solutions.”
We then had several sessions where we talked about aspects such as IPR, technical challenges and how we might sell the business case for using activity data to senior managers. We also heard from the synthesis project about how they planned to pull together the work of the projects. The technical discussion was particularly useful as it brought to the surface all sorts of issues around technical challenges, solutions and IP issues and approaches. Overall it was a really interesting and useful session.
One of the interesting consequences of having a hypothesis at the heart of a project is that it puts the focus very much on the investigative and experimental purpose of the project. We are not trying to build a long-term solution in the six-month project; we are trying to investigate a particular aspect. In many ways 'recommendations improve the search experience' always has, in my mind at least, a question mark against it. We don't know that it does help students, but we want to try to shed some light on how behaviour changes and what users think about the value. To me that is quite interesting because it takes the focus of the project away from the technical challenges of actually building the stuff and focuses on what you can do with it, what difference it makes and how you make that evaluation.
In the RISE project itself we've been making really good progress. We've agreed the database structure, and Paul, our developer, has built it and parsed all the EZProxy log files into it. We have log files going back to last December, so we have a reasonable amount of data to play with. We are really fortunate that Paul has some experience of recommendation systems, so he has come up with quite a few ideas about what we can do. We will blog some details about the technical approach on the RISE blog www.open.ac.uk/blogs/rise in the near future, but in general we are trying to create several levels of recommendations.
- Level one is recommendations based on connections between searchers: 'people on your course looked at these articles'.
- Level two creates recommendations based on relationships between documents: 'people who looked at this also looked at that'.
- Level three is subject-based recommendations: 'if you liked this you might like that'.
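As a sketch of the level two idea, co-occurrence within search sessions can be counted up fairly simply. The sessions and article identifiers here are toy data, not RISE output:

```python
from collections import Counter
from itertools import combinations

# Toy data: each session is the set of articles viewed together.
sessions = [
    {"art1", "art2", "art3"},
    {"art1", "art2"},
    {"art2", "art4"},
]

# Count how often each pair of articles was viewed in the same session.
co_views = Counter()
for session in sessions:
    for a, b in combinations(sorted(session), 2):
        co_views[(a, b)] += 1
        co_views[(b, a)] += 1

def also_looked_at(article, limit=3):
    """Level two style: articles most often viewed in the same session."""
    scores = Counter({b: n for (a, b), n in co_views.items() if a == article})
    return [b for b, _ in scores.most_common(limit)]

print(also_looked_at("art1"))  # → ['art2', 'art3']
```

Level one would work the same way but group views by course code rather than by session, and level three would match on subject metadata instead of viewing behaviour.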
The next stages are to build the search interface prototypes and finalise the user metrics we want to collect to be able to determine any changes in user behaviour. We plan to explore an A/B testing model by giving users different versions of the search interface and seeing how their behaviour changes. We've also got some work to start building a Google Gadget version of the EDS search. So we have a fair bit to be getting on with in RISE, but it's good to be able to spend some time looking at the wealth of largely unexploited data that we have in our systems.
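For the A/B testing, one simple approach would be to bucket users deterministically, so each person always sees the same interface version across visits. This is just a sketch of that idea; the variant names are made up:

```python
import hashlib

# Hypothetical interface variants for the search prototype.
VARIANTS = ["control", "with_recommendations"]

def assign_variant(user_id):
    """Hash the user id so assignment is stable across sessions,
    without having to store the allocation anywhere."""
    digest = hashlib.sha256(user_id.encode()).hexdigest()
    return VARIANTS[int(digest, 16) % len(VARIANTS)]

print(assign_variant("student42"))
```

Because the bucket comes from a hash rather than a random draw, the same user always lands in the same variant, which keeps the behavioural metrics for each interface version clean.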