To Birmingham at the start of last week for the latest Jisc Library Analytics and Metrics Project (LAMP) Community Advisory and Planning group meeting.  This was a chance to catch up with both the latest progress and also the latest thinking about how this library analytics and metrics work will develop.

At a time when learning analytics is a hot topic it’s highly relevant for libraries to consider how they might respond to the challenges of learning analytics. The 2014 Horizon report has learning analytics in the category of one year or less to adoption and describes it as ‘data analysis to inform decisions made on every tier of the education system, leveraging student data to deliver personalized learning, enable adaptive pedagogies and practices, and identify learning issues in time for them to be solved.’

LAMP is looking at library usage data of the sort that libraries collect routinely (loans, gate counts, eresource usage) but combines it with course, demographic and achievement data to allow libraries to start to be able to analyse and identify trends and themes from the data.

LAMP will build a tool to store and analyse data and is already working with some pilot institutions to design and fine-tune the tool.  We got to see some of the work so far and input into some of the wireframes and concepts, as well as hear about some of the plans for the next few months.

The day was also the chance to hear from the developers of a reference management tool called RefMe.  This referencing tool is aimed at students who often struggle with the typically complex requirements of referencing styles and tools.  To hear about one-click referencing, with thousands of styles and with features to integrate with MS Word, or to scan in a barcode and reference a book, was really good.  RefMe is available as an iOS or Android app and as a desktop version.  As someone who’s spent a fair amount of time wrestling with the complexities of referencing in projects that have tried to get simple referencing tools in front of students, it is really good to see a start-up tackling this area.

I was intrigued to read David Weinberger’s blog post ‘Protecting library privacy with a hard opt-in’.  In it he suggests that there is a case to be made for asking users to explicitly opt in to publishing details of their checkouts (loans) before you can use that activity data.  I must admit that I’d completely missed the connection between David Weinberger, author of ‘Everything is miscellaneous’, and his role with the Harvard Innovation Lab, and I’m sure I’ve probably blogged about both in the past.

The concern that has been raised is about re-identification, where supposedly ‘anonymous’ datasets can be combined with other data to identify individuals.  There’s a good description of the issue in a 2008 paper from Michael Hay and others from the University of Massachusetts.

Obviously an issue of this type is of critical significance when you might be talking about medical trials data for example, but library data might also be personal or sensitive.  Aside from the personal aspects you could also imagine that a researcher carrying out a literature search for material for a potential new research area would not want ‘competitors’ to know that they were looking at a particular area, particularly now cross-domain research activities are more common.

The issue of anonymity and potentially being able to identify an individual from their activity data is an area that has been explored through a number of projects, such as in Jisc’s Activity Data programme and its synthesis project outputs, particularly in the section on data protection.   Most of the approaches tackled anonymization in two ways: by replacing user IDs with a generated ID (described interestingly by Hay as ‘naive anonymization’) and by removing data from the dataset if there were only small numbers of users included (such as a course with only a few students enrolled).
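As a rough illustration of those two steps, here is a minimal Python sketch.  It is not taken from any of the Activity Data projects: the field names (`user_id`, `course`, `resource`), the function name and the threshold of five users are all assumptions made for the example.

```python
import secrets


def pseudonymise(records, min_group_size=5):
    """Apply the two steps described above to a list of activity records.

    `records` is a list of dicts with hypothetical keys 'user_id',
    'course' and 'resource'. `min_group_size` is an assumed threshold.
    """
    # Step 2 needs to know how many distinct users each course has.
    users_per_course = {}
    for rec in records:
        users_per_course.setdefault(rec["course"], set()).add(rec["user_id"])

    id_map = {}  # real user ID -> generated ID, kept private
    cleaned = []
    for rec in records:
        # Step 2: drop rows from courses with too few users.
        if len(users_per_course[rec["course"]]) < min_group_size:
            continue
        # Step 1: replace the real ID with a random generated ID.
        if rec["user_id"] not in id_map:
            id_map[rec["user_id"]] = secrets.token_hex(8)
        cleaned.append({
            "user": id_map[rec["user_id"]],
            "course": rec["course"],
            "resource": rec["resource"],
        })
    return cleaned
```

Note that the same person still gets the same generated ID on every row, so their pattern of use survives intact.  That is exactly why Hay calls this kind of approach ‘naive’.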

Re-identification techniques seem to work by picking out unique patterns of use, called digital fingerprints, that can be used to identify individuals.  When you combine data from an anonymized dataset with other material you can start to identify individuals.  It certainly seems to be something that needs to be thought carefully about when contemplating releasing datasets.
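One simple way to get a feel for the risk before releasing a dataset is to count how many users have a pattern of use that nobody else shares.  The sketch below is only an illustration of that idea, assuming the data is a list of (generated ID, item) pairs; the function name and data shape are hypothetical, and real re-identification attacks are considerably more sophisticated than this.

```python
from collections import Counter


def unique_fingerprints(loans):
    """Return the generated IDs whose set of items is shared by no one else.

    `loans` is an iterable of (pseudonym, item) pairs, an assumed shape
    for an 'anonymized' activity dataset.
    """
    items_by_user = {}
    for user, item in loans:
        items_by_user.setdefault(user, set()).add(item)

    # Treat the full set of items a user touched as their fingerprint.
    fingerprints = {user: frozenset(items) for user, items in items_by_user.items()}
    counts = Counter(fingerprints.values())

    # A fingerprint seen only once singles out one individual.
    return sorted(user for user, fp in fingerprints.items() if counts[fp] == 1)
```

If a large share of users come back from a check like this, then anyone who knows a few of the things a person borrowed could probably find that person’s whole record.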

Is the suggested solution of asking for explicit permission the right approach?  If you are planning to release data openly, I’d probably agree.  If you plan to use it only within your systems to generate recommendations, then yes, it’s probably good practice. I worry slightly about the value of the activity data if there is a low opt-in level.  That may significantly diminish its value and usefulness.

I’m not too convinced though about the approach that says that users agree to a public page that lists their activity.  That would seem to me to discourage opt-in among people who might be quite content for their data to be used unattributed in recommendations.  When we’ve asked students about their views of what data we should be able to use they were quite happy for activity data to be used.   My view would be that it’s fine to show an individual what they have used (and we do that), but not something to share.

To Birmingham today for the second meeting of the Jisc LAMP (library analytics and metrics project) community advisory and planning group. This is a short Jisc-managed project that is working to build a prototype dashboard tool that should allow benchmarking and statistical significance tests on a range of library analytics data.

The LAMP project blog is a good place to start to get up to speed with the work that LAMP is doing and I’m sure that there will be an update on the blog soon to cover some of the things that we discussed during the day.

One of the things that I always find useful about these types of activity, beyond the specific discussions and knowledge sharing about the project and the opportunity to talk to other people working in the sector, is that there is invariably some tool or technique that gets used in the project or programme meetings that you can take away and use more widely. I think I’ve blogged before about the Harvard Elevator pitch from a previous Jisc programme meeting.

This time we were taken through an approach of carrying out a review of the project a couple of years hence, where you had to imagine that the project had failed totally. It hadn’t delivered anything that was useful, so no product, tool or learning came out of the project. It was a complete failure.

We were then asked to try to think about reasons why the project had failed to deliver. So we spent half an hour or so individually writing reasons onto post-it notes. At the end of that time we went round the room reading out the ideas and matching them with similar post-it notes, with Ben and Andy sticking them to a wall and arranging them in groups based on similarity.

It quickly shifted away from going round formally to more of a collective sharing of ideas but that was good and the technique really seemed to be pretty effective at capturing challenges. So we had challenges grouped around technology and data, political and community aspects, and legal aspects for example.

We then spent a bit of time reviewing and recategorising the post-it notes into categories that people were reasonably happy with. Then came the challenge of going through each of the groups of ideas and working out what, if anything, the project could or should do to minimise the risk of that possible outcome happening. That was a really interesting exercise to identify some actions that could be done in the project such as engagement to encourage more take up.

A really interesting demonstration of quite a powerful technique that’s going to be pretty useful for many project settings. It seemed to be a really good way of trying to think about potential hurdles for a project and went beyond what you might normally try to do when thinking about risks, issues and engagement.

It’s interesting to me how so many of the good project management techniques work on the basis of working backwards. Whether that is about writing tasks for a One Page Project Plan based on describing the task as if it has been completed, e.g. Site launch completed, or whether it is about working backwards from an end state to plan out the steps and the timescale you will have to go through. These both envisage what a successful project looks like, while the pre-mortem thinks about what might go wrong. Useful technique.

I noticed an interesting Jisc-funded project at Liverpool today that I hadn’t previously heard about (blogged by Jisc today) that talked about a method of sharing resources amongst students using a crowdsourcing approach.  The service is called Kritikos and takes several quite interesting approaches.  At the heart of the system is some work that has been done with students to identify resources relevant to their subjects (in this case Engineering) and also to identify results that weren’t relevant (often because some engineering terms have different meanings elsewhere – e.g. stress). That’s an interesting approach as one of the criticisms I’ve heard about discovery systems is that they struggle to distinguish between terms that are used across different disciplines (differentiation for example having separate meanings in mathematics and biology).

The search system uses a Google Custom Search Engine but then presents the results as images which is a fascinating way of approaching this aspect.  Kritikos also makes use of the Learning Registry to store data about students’ interactions with the resources and whether they found them relevant or not.  It seems to be a really novel approach to providing a search system that could go some way to address one of the common comments that we’ve been seeing in some work we’ve been doing with students. They feel that they are being deluged with too much material and struggle to find the gold nuggets that give them everything they want.

Kritikos looks to be particularly useful for students in the later stages of their degrees, where they are more likely to be doing some research or independent study.  One of the things that we are finding from our work is that students at earlier stages are less interested in what other students are doing or what they might recommend.  But possibly if they were presented with something like Kritikos they might be more inclined to see the value of other students’ recommendations.

Lorcan Dempsey’s slides and video from the ‘Squeezed Middle’ have now been made available.  The slides are on the OCLC website and the video of the presentation is on YouTube.

The video was used during the ‘Squeezed Middle’ workshop to introduce the initial piece of work to look at trends in terms of Collections, Space, Systems and Expertise/Services. 

Reflections on the presentation
The contrast between libraries that grew up at an institutional scale and now face challenges from organisations that are the product of the network webscale environment was interesting to hear articulated in this way.  Drawing lessons from how other industries have had to adapt, Lorcan referenced John Hagel’s ‘Unbundling the corporation’ paper from Harvard Business Review from 1999 and talked about the trend towards greater specialisation.  Talking about three elements of customer engagement, innovation and infrastructure, and offering up a number of interesting examples from the University of Michigan and elsewhere, Lorcan offered the view that priorities for libraries should be around engagement and innovation, with less effort going into infrastructure.

I was quite interested to hear the Discovery solutions characterised as ‘data wells’ with an intriguing question about what other content aggregators such as Thomson and Elsevier might do.  Picking up on the point about a key factor being ‘disclosure’ of your content to the network-scale services such as Google Books and Google Scholar (with a comment about 75% of Minnesota’s SFX requests coming from outside the institution), it does make me wonder what the longer-term role might be of the current generation of ‘Discovery’ services.

Following on from the JISC/SCONUL ‘Squeezed Middle’ workshop that I blogged about earlier, Paul Stainthorp has blogged about his experience and included the paper he presented on his blog here.  Ben Showers, from JISC, has also blogged about the event here on the JISC Digital Infrastructure Team blog.   Links to Ken Chad’s [update to the update: now available here] and David Kay’s provocations/presentations and Lorcan Dempsey’s video are also promised.  There’s also a useful list of the priorities that came out of the workshop, put together by David Kay, here.  This list sets out the priorities in five different areas: ebooks, non-traditional assets, end-user applications, library roles and above campus services.

New JISC call
One of the motivations behind the workshop was to help to inform (both JISC and the HE community) about a new JISC call (12/01) that includes a couple of LMS strands.  One covers a project to create “a new vision for the future of library systems and a ‘roadmap’ for the delivery of that vision”.  There certainly seems to be a lot more activity in the LMS area at the moment with new products, open source solutions and shared systems.    The second strand covers a set of “pathfinder projects to investigate a broad range of potential new models and approaches to library systems and services”.  The themes within this area cover shared library systems, emerging tools and technologies and emerging library systems opportunities.  There is quite a wide range of different aspects touched on in the call paper, ranging through reference management to data.  A lot of potential for some interesting ideas to emerge.

Did you ever get that feeling…
When you stand up to give a presentation at a workshop and then realise that there are several people in the room that had heard you give a presentation on the same subject yesterday.  And then realise that actually, two of them had also heard you talking about exactly the same subject on Monday.  Well, that was Wednesday last week, talking about our RISE Activity Data project, for the third day in a row.

They were all slightly different presentations reducing in time from 30 minutes down to 5 minutes and starting with 60 slides on Monday, reducing to 20 on Tuesday and 2 on Wednesday.  I’d always wondered whether ‘death by Powerpoint’ referred to the audience or the presenter.  So the number of slides was going in the right direction.  But nonetheless it was a good chance to talk to people about some stuff we’re doing with Activity Data, and as always with these types of events it was good to hear about other work that is happening, meet different people, and learn new things.

One of the interesting things of any project is the unexpected directions that they take you in.  The whole e-journal industry/technology thing is something that hadn’t really figured in my IT work in public libraries.  We’d just about started to think about OpenURL routers as we started to build e-resource collections, but it has definitely been a learning curve to get up to speed with these systems, and I’m still learning, so work in this area is helping build my knowledge.  So apologies for any statements of the blindingly obvious for anyone who has been working in this area for a long time!

It’s my metadata (or you can look but not touch…)
One thing that intrigues me is the protective attitude to bibliographic metadata.  Surely its role is to advertise your product? So why would you limit what people can do with it?   It is something that is exercising us in the RISE project but seeing JISC’s call this week for metadata to be open made me wonder about what would encourage publishers who supply metadata about their content to be more open with it.

I’ve already wondered about why publishers don’t provide their metadata to Discovery solutions  but that is clearly just one aspect of the issue.  I gather from comments that more are providing their metadata to discovery solutions. But having provided metadata into discovery solutions then doesn’t that make the case that metadata is about advertising your material?  It’s like advertising in the yellow pages isn’t it?  You advertise your content so that users can find it, so wouldn’t you want that metadata to be spread far and wide?

I can understand that when publishers and aggregators created Abstracting and Indexing services then there was value in that metadata as you needed those tools to find the content and could make money out of those services, but in the world of discovery services then hasn’t the game changed fundamentally?  If publishers made their metadata available more freely for other people to create their own innovative services, such as recommenders, then wouldn’t that publisher’s content be more likely to be recommended, and more likely to be used, and therefore be cited more frequently and have increased value?

New activity data project
I blogged back in November on some reflections on preparing and submitting a funding bid.  Well, we’ve recently received word that we are one of eight projects to be funded as part of the JISC programme and we’re really pleased to be successful.  It’s a fairly short six month project but is working in an area that I find particularly interesting, that of user activity data and how it can (and should) be used to improve services.

Our new project RISE is part of the JISC Activity Data programme and the projects cover quite a wide range of data from VLEs, library systems and other corporate systems so it will be really interesting to see what comes out of them.  We’re going to be looking at data from our ezproxy system which we use as the primary method of getting students and staff access to licensed library resources.   As most of our use is remote (unsurprisingly as I work at a distance learning institution) it’s a critical system for us.  We’ve recently put in place a new aggregated search system (Ebsco Discovery Solution) and we are going to explore how we can use our ezproxy search data to provide recommendations to searchers and see what difference it makes to their behaviour.  Using data to provide recommendations is an everyday part of commercial life and taken for granted if you’re a regular user of services such as Amazon.  But libraries (with notable exceptions such as Huddersfield) have been strangely reluctant to exploit the rich store of activity data that they can collect to improve the user experience.
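To give a flavour of the Amazon-style ‘people who used this also used’ idea, here is a small Python sketch that counts which resources turn up together in the same session.  It is only an illustration and not how RISE works: the function name is hypothetical, and it assumes the proxy log has already been boiled down to one list of resource identifiers per session.

```python
from collections import Counter, defaultdict
from itertools import combinations


def build_recommendations(sessions, top_n=3):
    """Suggest, for each resource, the resources most often used with it.

    `sessions` is a list of lists of resource identifiers, one inner
    list per user session (an assumed, pre-processed form of the log).
    """
    co_counts = defaultdict(Counter)
    for resources in sessions:
        # Count each pair once per session, however often it was clicked.
        for a, b in combinations(sorted(set(resources)), 2):
            co_counts[a][b] += 1
            co_counts[b][a] += 1

    # Keep only the top_n most frequent companions for each resource.
    return {
        resource: [other for other, _ in counts.most_common(top_n)]
        for resource, counts in co_counts.items()
    }
```

For example, `build_recommendations([["j1", "j2"], ["j1", "j2", "j3"], ["j2", "j3"]])` would suggest `j2` first for someone looking at `j1`, because the two appear together in two sessions.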

RISE gives us the opportunity to see what we can do with the data in terms of making recommendations, what the challenges are, what we can’t do with it, and whether it is of interest to anyone else.  It’s going to be an interesting few months. RISE will shortly have its own project blog and twitter tag #ourise and I’m aiming to blog some personal reflections and notes on how we get on here.

Two things this week made me think about how the HE library landscape might be changing in the next few years.  One was the SCONUL Shared Services event; the other was Lorcan Dempsey’s keynote presentation from the emtacl conference in Norway that is now on the web.

The connection between the two was that both saw a potential future for libraries in the networked environment where shared services had a part to play.  One of the comments that is sometimes made in relation to shared services in HE is that things are different in HE because HE institutions are competing with each other.  I’d question how much of the institution’s competitive edge the library represents and I’d argue that the distinctive elements are more likely to be in their customer service, expertise or collection quality rather than their systems.

Amongst his remarks on the network effect and how much that will lead to changes Lorcan pointed to examples from the commercial world where companies like Netflix buy in services from Amazon, who are a competitor, because Amazon’s expertise represents the best available.  In the library world Lorcan saw much duplication of activity around libraries, arguing that “redundant, complex systems apparatus will have to be simplified” with “a move to shared infrastructure in the cloud”.

The SCONUL Shared Services event covered a model for how Shared Services might work for HE libraries.  The day was spent talking about the Shared Services study Business Case that was submitted to Hefce towards the end of 2009 requesting pathfinder funding.

SCONUL Shared Services domain model

In brief the report proposed a three phase set of developments:

1.  the creation of a national-level managed ERM system that would manage national level subscriptions to resources and a national ERM licensing service – rather than each HE library having their own ERM processes

2. the creation of a Discovery to Delivery service that would combine the national ERM entitlement data with authentication and search services at a national level

3. removing duplication with library management systems, using the national-level authentication infrastructure and inter-operating with institutional systems

The expectation is that savings would be made by reducing duplication in licensing and rights management, and by negotiating national subscriptions (going beyond the opt-in model) to cut the cost of e-licence deals.  Individual HE libraries would do less rights/licensing work, and there would be savings in the cost of local ERM systems, licensing staff, Search/LMS costs and LMS support staff.

Unfortunately Hefce have not approved the request for pathfinder money so SCONUL are looking at what options there are to move this forward.  There was the feeling at the meeting that the ERM element could be an achievable step, although a lot of detail still needs to be sorted out.  The suggestion was made that progress could be made if enough HEIs were prepared to contribute a small amount to getting it off the ground.    The general view was that we should be doing this, but probably more realistically not as a single large project but as a step by step process.  JISC and SCONUL are keen to move it on but it isn’t all that clear how it might be funded.

Thinking about the proposals there are a few things that strike me about it:

  • I wonder about the realism of the timetable – mainly in relation to whether this can happen quickly enough given the likely scenario of major cuts in funding for the sector.
  • I must admit to a slight sense of déjà-vu – having sat through a lot of the MLA Stock Services review in public libraries – and seen that go from proposals for shared services to something that just ran into the ground – then I’m interested to see how the HE library sector tackles something like this and whether it has any more success with instituting such a major network-scale change.
  • Others are a lot more qualified to comment on the technical practicality of some of the developments but I find it strange that while there are several examples of shared library management systems (LLC, SELMS for example), there are few in HE these days.

It will be interesting to see what happens over the next few months and what opportunities there are to get involved.

Feedback from the JISC ‘Modelling the Library Domain’ workshop in June is now available on the web.  A few interesting comments amongst the feedback about the definitions.

Some good suggestions about next steps, particularly about providing some more explanation and clarification/exemplars to make it easier for people to understand the applicability of the Model.

