You are currently browsing the monthly archive for March 2011.

What do you want a user to do as a result of your recommendation?
If you are offering recommendations to users then you may have some specific outcomes that you want to achieve.  On Amazon the recommendations ‘people who bought this also bought that’ would seem to be aimed firmly at increasing sales.  I’d wonder with Amazon whether it also broadens or narrows the range of titles that are sold.  Does it encourage customers to buy items they wouldn’t normally have considered?  I’m sure that is true, but is it reinforcing ‘bestseller’ lists by encouraging customers to buy the same items other people have bought, or is it encouraging them to buy items from backlists?  Is it exploiting the ‘long-tail’ of books that are available?

There’s evidence from Huddersfield that adding recommendations to the catalogue increased the number of titles being borrowed, as reported in Dave Pattern’s blog.  His slides have an interesting chart showing how the range of titles borrowed increased.  So that is clearly having an impact on the long-tail of stock within a library.  The SALT Project with John Rylands and MIMAS is specifically looking at how recommendations might encourage humanities researchers to exploit underused materials in catalogues.  SALT, like RISE, is being funded as part of the JISC Activity Data strand.

With the RISE project we are working with a narrow set of data in that the recommendations database will only contain entries for articles that have been accessed already.  So there is less direct opportunity to exploit the long-tail of articles by showing them as recommendations. But our interface will be using the discovery service search so users see EDS results directly from that service alongside our recommendations, so there will be some potential broadening of the recommendations in the database.

Time-sensitive recommendations
One other aspect of recommendations that has come up is the extent to which they may be time-dependent for HE libraries.  I was talking through some stuff about RISE with Tony Hirst the other week and he challenged us to think about when recommendations will be useful to a student.

We build and run our courses in a linear fashion, so students go step by step through their studies doing assignment x on subject y and looking at resources z.  Then they move on to the next piece of coursework.  So with recommendations reflecting what happened in the past there’s a danger that the articles students on my course have been looking at all relate to last week’s assignment and not this week’s.

So that introduces a time element.  A student may be interested in what students looked at the last time the assignment was set, which may have been a year ago (for the Open University where some modules run twice a year and some run yearly it might even be a different time period from course to course).  So that implies that you might want to introduce a time element into your recommendation algorithm.  This would need to check the current date and relate it to the course start date, then use that data and relate it to the last time that the course was run.  We discussed that you would need to factor in a window either side to cope with the spread of time that students might be working on an assignment.  At the moment for us it’s a moot point as our data only goes back to the end of 2010 so we can’t make those sorts of recommendations anyway.  But it’s certainly something that needs to be considered.
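As a rough illustration of the date arithmetic described above, here is a minimal Python sketch.  It is not the RISE implementation: the function names, the 14-day window and the example dates are all assumptions made up for the example.  It works out how far into the current presentation a student is, finds the same point in the previous presentation, and widens it by a window either side.

```python
from datetime import date, timedelta


def equivalent_window(today, current_start, previous_start, window_days=14):
    """Return (start, end) dates in the previous presentation of a course
    that correspond to 'today' in the current presentation.

    window_days is a hypothetical margin either side, to cope with the
    spread of time students spend on an assignment.
    """
    offset = today - current_start        # how far into the course we are now
    centre = previous_start + offset      # the same point in the previous run
    margin = timedelta(days=window_days)
    return centre - margin, centre + margin


def accesses_in_window(accesses, window):
    """Keep only the (access_date, article_id) pairs that fall in the window."""
    start, end = window
    return [(day, article) for day, article in accesses if start <= day <= end]


# Example with invented dates: a module that started on 5 February 2011
# and previously started on 6 February 2010.
window = equivalent_window(date(2011, 3, 21), date(2011, 2, 5), date(2010, 2, 6))
log = [
    (date(2010, 3, 20), "article-1"),   # inside the window
    (date(2010, 6, 1), "article-2"),    # much later in the previous run
]
relevant = accesses_in_window(log, window)
```

For modules that run twice a year the only change would be the value passed as previous_start, which is why the start dates are parameters rather than a fixed one-year offset.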

(the blog post title owes a lot to Alan Sillitoe’s story and film of the same name ‘the Loneliness of the long-distance runner’)

A good thing about projects is the way they lead you to question assumptions about stuff.  And RISE is doing that in a number of ways, not least my assumptions about what recommendations are.  In our terms recommendation services provide information on things that are likely to be of interest to the user, but the term ‘recommendations’ is already used elsewhere in libraries with a subtly different meaning.  We use the term ‘recommendation’ where we might talk about recommended reading, or articles recommended by your course tutor.  And you start to get the sense that there is some form of value hierarchy at work here.

When you start to look at making recommendations based on behaviour (I suppose you could call them ‘derived recommendations’ because they exist as a by-product of search behaviour) then you start to realise that there may be a difference between recommended reading, meaning something that the course suggests you read, and derived recommendations, in terms of which might be of more value to the user (or maybe which has a higher ‘perceived’ value to users).  Now that implies to me that there is some form of hierarchy of value in recommendations.  Thinking about it, it seems to me that the value hierarchy might go something like this:

  1. Recommended readings – listed in your course materials – not things you have to read for your course, but things you should read;
  2. Recommended by the librarian – things the librarian says will be useful;
  3. Recommended by your peers – things that they’ve read and found useful – and that might be peers on your course, or on your social network;
  4. Recommended by other means – so books that seem to be similar (at the same class number on the shelf) for example.

So where do ‘derived recommendations’ fit into that hierarchy model?  Well with RISE we’re looking at several different types.  Recommendations based on what ‘people on your course are looking at’ (Type A), those based on other things that people looked at in the same search sessions (Type B) and recommendations based on having a similar subject to an article you’ve looked at (Type C).   So Type A seems to map quite closely to 3, Type B may also map to 3 but possibly slightly lower, and Type C would seem to equate to 4.

What will be interesting in the testing will be to see how these map out in reality.  What’s interesting to me is that when asked at one of our search focus groups the response to the value of recommendations was that yes they would be useful.  Recommendations from people on your course were useful, but what they would really find useful was knowing what resources were being used by people who got high marks.   That’s a really interesting comment and pretty challenging to be able to tackle that one in a sensible way.  But it also implies that recommendations from people who previously studied a course are particularly valuable which brings into play a timescale issue around recommendations that I need to think about some more.

A comment from someone in a meeting last week, that on one of the websites mobile traffic was now 10% of all traffic, sent me off to Google Analytics to check the latest position with our main website.  We’ve certainly seen big year-on-year growth in mobile use: 2010 saw eight times as many visits from mobile devices as 2009.  This year it looks like doubling.  But mobile use is still around 2% of visits rather than 10%.

[Image: Mobile visits]

Although we do have a mobile version of the website, it hasn’t been promoted heavily.  It detects mobile devices and directs them automatically to a mobile interface, but it considers iPhones and iPads to be sufficiently internet-capable to be sent to the standard website interface rather than a cut-down version.

Digging a bit deeper into the analytics shows that iPad usage is now 50% of what Google Analytics classes as ‘mobile’ use (up from 38% last year).  Based on the first two months of this year iPad usage looks to be up by three times, while non-iPad mobile use looks to be increasing by about 20%.  Whilst we are working on a new mobile version of the Drupal website we aren’t planning an iPad app version.

What intrigues me is whether iPads really are mobile devices for websites.  The Safari browser is perfectly functional (Flash inabilities notwithstanding), and although some sites direct you to mobile versions (or, like Google Docs, give you the option) it’s a purpose-built internet browsing machine.  This year there are dozens of tablet-type devices being launched with a variety of different operating systems.  iPads already seem to be coming up as the ‘mobile’ device most likely to be using our website, although internal use plays a part in that.  So it implies for me that we need to be a bit more selective in how we define mobile use (and maybe so should Google Analytics) and split the mobile category into tablet use of the full website and mobile use of the mobile version.
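To make the idea of splitting the ‘mobile’ category concrete, here is a small Python sketch that sorts visits into tablet, mobile and desktop from the browser’s user-agent string.  It is only a heuristic under stated assumptions: the token list is illustrative and far from complete, the rule that an Android user agent without ‘Mobile’ is a tablet is a convention rather than a guarantee, and the sample user-agent strings are shortened.  It says nothing about how Google Analytics itself classifies devices.

```python
from collections import Counter

# Illustrative (and deliberately incomplete) list of phone markers.
PHONE_TOKENS = ("iphone", "ipod", "android", "blackberry",
                "opera mini", "windows phone")


def classify_device(user_agent):
    """Very rough split of visits into 'tablet', 'mobile' or 'desktop'."""
    ua = user_agent.lower()
    # iPad user agents also contain 'Mobile', so check for the iPad first.
    if "ipad" in ua:
        return "tablet"
    # Assumed convention: Android without 'Mobile' indicates a tablet.
    if "android" in ua and "mobile" not in ua:
        return "tablet"
    if any(token in ua for token in PHONE_TOKENS):
        return "mobile"
    return "desktop"


def count_by_device(user_agents):
    """Tally visits per device class."""
    return Counter(classify_device(ua) for ua in user_agents)


sample_visits = [
    "Mozilla/5.0 (iPad; U; CPU OS 4_3 like Mac OS X) Mobile Safari",
    "Mozilla/5.0 (iPhone; U; CPU iPhone OS 4_3 like Mac OS X) Mobile Safari",
    "Mozilla/5.0 (Windows NT 6.1; rv:2.0) Gecko/20100101 Firefox/4.0",
]
totals = count_by_device(sample_visits)
```

With a split like this, tablet visits to the full website and phone visits to the mobile version could be reported separately rather than lumped together as ‘mobile’.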

Wednesday saw an early start to travel to the startup meeting for the JISC Activity Data programme, so two of the RISE team were in Birmingham.  It was a good opportunity to talk to the other projects in the strand and find out what they are doing. Some of the projects we were reasonably familiar with such as the UCIAD project in KMi at the OU and the Library Impact Data project being run by Huddersfield, but others we didn’t know too much about, such as the SALT project at John Rylands and Mimas and the AEIOU project in Wales. 

Rather than just getting each project to outline the work they were doing, the Programme meeting took a different approach, so we each did a short presentation about the hypothesis that is at the root of each of the projects in the Activity Data strand.  In our case our hypothesis is that

“Recommendations Improve the Search experience in new generation discovery solutions.” 

We then had several sessions where we talked about aspects such as IPR, technical challenges and how we might sell the business case for using activity data to senior managers.  We also heard from the synthesis project about how they planned to pull together the work of the projects.  The technical discussion was particularly useful as it brought to the surface all sorts of issues around technical challenges, solutions and IP issues and approaches.   Overall it was a really interesting and useful session.

One of the interesting consequences of having a hypothesis at the heart of a project is that it puts the focus very much on the investigative and experimental purpose of the project.  We are not trying to build a long-term solution in the six-month project, but we are trying to investigate a particular aspect.  In many ways ‘Recommendations Improve the Search Experience’, in my mind at least, always has a question mark against it.  We don’t know that it does help students, but we want to try to shed some light on how behaviour changes and what users think about the value.  To me that is quite interesting because it takes the focus of the project away from the technical challenges of actually building the stuff and puts it on what you can do with it, what difference it makes and how you make that evaluation.

In the RISE project itself we’ve been making really good progress.  We’ve agreed the database structure and Paul, our developer, has built it and parsed all the EZProxy log files into it.  We’ve log files going back to last December so have a reasonable amount of data to play with.  We are really fortunate that Paul has some experience of recommendation systems so has come up with quite a few ideas about what we can do.  We will blog some details about the technical approach on the RISE blog in the near future but in general we are trying to create several levels of recommendations.

  • Level one is recommendations based on connections between searchers ‘people on your course looked at these articles’
  • Level two creates recommendations based on relationships between documents.  ‘people who looked at this also looked at that’
  • Level three is subject-based recommendations.  ‘if you liked this you might like that’
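The three levels above can be sketched in a few lines of Python.  This is only an illustration and not Paul’s design: the Access record, its field names and the sample log are invented for the example, and simple access counts stand in for whatever ranking the real system ends up using.

```python
from collections import Counter, namedtuple

# Hypothetical shape of one parsed log entry.
Access = namedtuple("Access", "user course session article subject")

LOG = [
    Access("u1", "A100", "s1", "art1", "history"),
    Access("u1", "A100", "s1", "art2", "history"),
    Access("u2", "A100", "s2", "art2", "history"),
    Access("u2", "A100", "s2", "art3", "politics"),
    Access("u3", "B200", "s3", "art1", "history"),
    Access("u3", "B200", "s3", "art4", "history"),
]


def top(counts, limit):
    """Return article ids ordered by how often they were accessed."""
    return [article for article, _ in counts.most_common(limit)]


def by_course(log, course, limit=5):
    """Level one: 'people on your course looked at these articles'."""
    return top(Counter(a.article for a in log if a.course == course), limit)


def by_session(log, article, limit=5):
    """Level two: 'people who looked at this also looked at that'."""
    sessions = {a.session for a in log if a.article == article}
    return top(Counter(a.article for a in log
                       if a.session in sessions and a.article != article), limit)


def by_subject(log, article, limit=5):
    """Level three: other articles sharing a subject with this one."""
    subjects = {a.subject for a in log if a.article == article}
    return top(Counter(a.article for a in log
                       if a.subject in subjects and a.article != article), limit)
```

All three functions only ever return articles that already appear in the log, which is the narrowness discussed elsewhere on this blog: an article nobody has accessed can never be recommended by any of the levels.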

The next stages are to build the search interface prototypes and finalise the user metrics we want to collect to be able to determine any changes in user behaviour.  We plan to explore an A/B testing model by giving users different versions of the search interface and seeing how their behaviour changes.  We’ve also got some work to start to build a Google Gadget version of the EDS search.  So we have a fair bit to be getting on with in RISE but it’s good to be able to spend some time looking at the wealth of largely unexploited data that we have in our systems.
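One common way to run that kind of A/B test is to assign each user to a version of the interface by hashing their identifier, so the same user always sees the same version, and then compare a metric between the groups.  The Python sketch below shows the idea.  The function names are hypothetical, and the click-through rate on recommendations is only an assumed example metric, since the metrics for RISE have not been finalised.

```python
import hashlib
from collections import Counter


def assign_variant(user_id, variants=("A", "B")):
    """Deterministically map a user to one interface variant."""
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    return variants[int(digest, 16) % len(variants)]


def click_through_rates(events):
    """events is an iterable of (variant, clicked) pairs, where clicked is
    True if the user followed a recommendation during that search.
    Returns the proportion of searches with a click, per variant."""
    shown = Counter()
    clicked = Counter()
    for variant, did_click in events:
        shown[variant] += 1
        if did_click:
            clicked[variant] += 1
    return {variant: clicked[variant] / shown[variant] for variant in shown}


# Invented example data: two searches per variant.
rates = click_through_rates([
    ("A", True), ("A", False),
    ("B", False), ("B", False),
])
```

Hashing rather than picking at random on each visit means no assignment table has to be stored, and a returning user does not flip between interfaces part way through the test.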


Creative Commons License