I had the opportunity to go and listen to Martin Weller (@mweller on Twitter) and Nick Pearce (@drnickpearce) talking about their work on Digital Scholarship this morning.  I’d put together some thoughts last year in an earlier blog post – Digital scholarship and the challenges for libraries – so it was good to get an update on how the work is moving forward.

Digital Scholarship context
Nick Pearce set the context for Digital Scholarship with a short presentation – available on slideshare here.  Looking at technology first he set out the view that books and language could be viewed as ‘technologies’.   Books as a technology wasn’t too contentious for a room full of library people.  Language as a technology is a bit more of a stretch but if you view it as a tool to enable change in a community then it’s a good analogy.   His comment that ‘old technologies often persist – for good reasons’ was particularly interesting and the classic example is radio continuing alongside TV.   But I’d wonder if these two technologies are fulfilling exactly the same role or whether they have established different roles for themselves. 

Turning to digital technologies he pointed to the large number and wide variety of services, using Ludwig Gatzke’s image of the incredible range of web 2.0 services to illustrate how this year’s favourite technology is next year’s history.  Many of the services shown in the image no longer exist and the list doesn’t show services such as Twitter that are currently very popular.

That points to a real risk that you choose to adopt a technology platform that turns out to be transient or you find that ‘a year later everyone has moved on’.

Nick then looked at some of the key features of the digital environment and suggested that only a small number of users were actually creating content (which gives me pause for thought given the enormous growth that sites like YouTube are experiencing with user-generated content), and that you are relying on sites that are in perpetual evolution, effectively constantly in beta-testing.


Technologies, issues and challenges
Turning to scholarship and using Boyer’s “Scholarship reconsidered” model we started to briefly look at what technologies, issues and challenges might present themselves for the four elements of Boyer’s model.

  • Discovery
  • Integration
  • Application
  • Teaching

Ideas that came up included the ever-increasing amount of data (the data deluge), challenges in economics and funding, and issues around social networking.   Nick went on to give some examples of Open Data, Open Publishing (the Open Access movement), Open engagement through blogs and Twitter feeds from people such as Richard Dawkins, and Open education (OpenLearn and OERs).

“the Open Scholar is someone who makes their intellectual projects and processes digitally visible and who invites and encourages ongoing criticism of their work and secondary uses of any or all parts of it–at any stage of its development.”  Academic Evolution

Digital Scholarship work
Martin Weller then took us through the work that is being carried out to investigate digital scholarship.  This comprises three elements:

  • promote digital scholarship
  • work on recognition 
  • research current practice

It was interesting to hear of the work to create a new digital scholarship hub DISCO that is being launched shortly, and good to get a brief preview of it.  Martin talked about his aim to formulate some kind of ‘magical metric to measure digital scholarship’ and it would be interesting to see how this sort of scoring system could be used – take the scorecard along to your appraisal with the results?   Aims included trying to decide what represented good public engagement and working on case studies that academics could use as part of their promotion case. 

Martin briefly covered some of the issues around digital scholarship including rights, skills, plagiarism, time and quality/depth.  We then spent a little time looking at issues, benefits and what we’d like to change.  The sorts of things that our group talked about included: difficulties of getting people to engage; the lack of awareness of what the technology can do; and concerns about quality in comparing peer-reviewed journals with blogs, for example.  For the library we thought there was a fit around the library increasingly focusing on electronic rather than print resources but there are challenges around managing and curating access to material in social networking environments that may be ephemeral.   The issue of persistent identifiers for this type of material is a real concern.

Finally, in an all too brief session, Martin flagged up the JISC Digital Scholarship ‘Advanced Technologies for Research’ event on 10 March 2010.

It was interesting that the presenters had slightly different perspectives on Digital Scholarship.   It would have been good to have a bit more time to talk through some of the discussions and have more feedback, but time was a bit limited.  It is fascinating to hear at first hand some of the work that is taking place to map out equivalencies between traditional academic practice and potentially new academic practices.  It would be good to get some of the counter-arguments as to why some people don’t think that blogs and suchlike are equivalent to traditional practice.

For libraries the issues are especially around discovery and providing access to the material.  A colleague made the point that librarians can’t evaluate the content in a blog as they don’t have the subject knowledge.  At present evaluation of resources is as much down to evaluating the quality of the publishing medium, e.g. it’s in Nature or a reputable resource so it should be appropriate.  With blogs librarians don’t have that context to use.

And the other big issue for libraries is persistence of links.  A whole technology industry has grown up around these problems (e.g. SFX, OpenURLs, DOIs etc.) and work is going to be needed to understand the implications of content migrating from a few hundred aggregated collections of peer-reviewed academic journals to many thousands of individual resources in the cloud.  But maybe this is where technologies such as Mendeley come in?

The rise of the datastore
The last year or so has seen the growth of a new type of resource available on the web, the “datastore”.   Datastores are collections of data, generally but not necessarily government data, and usually authoritative.  Examples include DataSF from San Francisco, Chicago City Data, the UK Government datastore, the London datastore and the Guardian newspaper’s datastore.

The defining aspect of datastores is that they provide ‘raw data’ collected together in one place, rather than being spread across many different government and other websites.   That raw data can cover a wide variety of subjects, from mortality statistics and Indices of Deprivation, through to ‘how many miles of high-speed railway’ and FTSE100 Directors’ pay.  Generally the data is presented in the form of tables, often in Excel or CSV format (or exportable in those formats).

Alongside the benefits of having the data collected together in one place, and in many cases having data that has never been made available publicly, datastores offer the potential to start to present and analyse the data through visualisations and data mashups – by combining data from more than one source and exposing connections.  There are a few examples on Tony Hirst’s blog and an example below using Many Eyes of a visualization of London population.

London population visualization
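A mashup of the kind described above can be sketched in a few lines of Python.  This is only an illustration, assuming two CSV exports that share an ‘area’ column; the column names, area names and figures are all invented for the example and are not taken from any real datastore.

```python
import csv
import io

# Illustrative stand-ins for two CSV exports from different datastores.
# Column names and figures are invented for this sketch.
POPULATION_CSV = """area,population
Area A,200000
Area B,150000
"""

LIBRARIES_CSV = """area,libraries
Area A,10
Area B,5
Area C,7
"""


def read_keyed(text, key):
    """Parse CSV text into a dict of rows keyed on one column."""
    return {row[key]: row for row in csv.DictReader(io.StringIO(text))}


def mashup(population_csv, libraries_csv):
    """Join the two datasets on 'area' and derive residents per library."""
    population = read_keyed(population_csv, "area")
    libraries = read_keyed(libraries_csv, "area")
    combined = []
    # Only areas present in both datasets can be combined.
    for area in sorted(population.keys() & libraries.keys()):
        people = int(population[area]["population"])
        branches = int(libraries[area]["libraries"])
        combined.append({
            "area": area,
            "population": people,
            "libraries": branches,
            "residents_per_library": people / branches,
        })
    return combined


for row in mashup(POPULATION_CSV, LIBRARIES_CSV):
    print(row)
```

The join key is the fragile part in practice: two datastores rarely label areas in exactly the same way, so some cleaning of the key column is usually needed before the datasets will line up.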










Challenges for libraries and librarians
Although in terms of discovery, datastores help by collecting together relevant data in one place, such as on , datastores do still present some specific challenges for libraries and librarians.   There are still some discovery challenges, but I would consider that the biggest challenges are around librarians getting to grips with exploiting the data within the datastores.

The challenges in this area are about finding the datastores and understanding what is contained within each datastore.  But these are skills that librarians are used to using to find and assess resources so shouldn’t present much of a challenge.  Techniques such as building Google Custom Search engines to search datastores can help with finding relevant data within these resources.

Building a custom search engine to search the London, UK Government and Guardian datastores is fairly straightforward, so I’ve built a quick example at

Using this form of search engine makes it simple to discover which datastores have datasets that may be of interest.



Exploiting datastores
Where I think it starts to become more difficult for libraries is in exploiting the data in the datastores.  There is a question here about the role of the librarian.  Is the role just to find the data, check its quality and promote it to academics and students, or is there a role to help users to find ways of using the data?  The latter role implies a much deeper understanding of how the data can be used, not just being able to export the data in a spreadsheet and produce a nice visualization, but also knowing how to use APIs to dig into datastores and how to use tools such as Yahoo Pipes to take data and transform it.  The question is how far librarians and libraries see doing that work themselves as their role, and how far they see their role as supporting students and academics in exploiting the data, by learning and teaching the techniques needed to understand and exploit it.

Obviously some librarians are more comfortable playing around with data than others, but the interest among librarians in the Mashed Library events indicates that a growing number of librarians are starting to appreciate that this is an area relevant to libraries.   But over the years libraries and librarians have had to get to grips with several generations of new technological innovations, from CD-ROMs, through the world wide web to RFID, and in each case librarians have taken on board new skills to exploit the new technologies and help their users.

Thinking about the different ways that libraries could be making use of user-activity data, it seems to me that there are two distinct categories:

  • Internal or indirect – i.e. using the data to improve library services (e.g. looking at loans data to show value and use of stock – Evidence-Based Stock Management), or to assess Value for Money (e.g. cost of resources being used)
  • Direct – i.e. using the data to help users make more informed choices when using library services. (e.g. users on your course borrowed these items)

Internal or indirect use of user-activity data
Apart from loans data, libraries might also have access to data from OPAC and other search systems, EZproxy-type systems, or systems such as SFX.  In many cases there are institutional VLEs that also track user activity.   For academic libraries the key to making use of much of this data is being able to identify the activity with a particular course.

Ideas around using this type of data include being able to assess the Value for Money of e-resources by breaking down the use by course.  Such an approach might also lead to some interesting ideas around cost and charging models.  For a library, showing the print and online resources that are ‘consumed’ by an individual course could go some way to proving its value.
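As a rough sketch of the breakdown by course, assuming each recorded use of an e-resource can be tagged with a course code (the course codes and the cost figure below are hypothetical), the annual cost of a resource could be apportioned in proportion to use:

```python
from collections import Counter


def cost_by_course(annual_cost, accesses):
    """Share a resource's annual cost between courses in proportion to use.

    `accesses` is an iterable of course codes, one entry per recorded use.
    """
    uses = Counter(accesses)
    total = sum(uses.values())
    if total == 0:
        return {}  # no recorded use, so nothing to apportion
    return {course: annual_cost * count / total for course, count in uses.items()}


# Hypothetical access log: three uses from course C1, one from course C2.
print(cost_by_course(1000.0, ["C1", "C1", "C1", "C2"]))
```

Proportional allocation is only one possible charging model; a flat share per course or a cost-per-use figure would be equally easy to derive from the same counts.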

With e-resources libraries are tied in to licence deals that don’t really take account of actual levels of use of those resources other than at a crude ‘size of institution’ level.  Accumulating data of actual use of those resources by different courses showing peaks/troughs and the maximum number of concurrent users could help the library sector in negotiating better licence terms.
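The maximum number of concurrent users can be worked out from session start and end times.  The sketch below assumes a log of (course, start, end) tuples with times expressed as minutes from some arbitrary origin; that log format is an assumption for illustration, not the output of any particular proxy or authentication system.

```python
from collections import defaultdict


def peak_concurrent(sessions):
    """Return the largest number of sessions open at the same moment.

    `sessions` is an iterable of (start, end) pairs.
    """
    events = []
    for start, end in sessions:
        events.append((start, 1))   # a session opens
        events.append((end, -1))    # a session closes
    # At equal times, closes (-1) sort before opens (+1), so a session that
    # ends exactly when another begins is not counted as overlapping.
    events.sort()
    current = peak = 0
    for _, delta in events:
        current += delta
        peak = max(peak, current)
    return peak


def peak_by_course(log):
    """Group (course, start, end) records by course and find each peak."""
    by_course = defaultdict(list)
    for course, start, end in log:
        by_course[course].append((start, end))
    return {course: peak_concurrent(s) for course, s in by_course.items()}


# Hypothetical log, times in minutes.
LOG = [("C1", 0, 10), ("C1", 5, 15), ("C1", 10, 20), ("C2", 0, 30)]
print(peak_by_course(LOG))
```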

User-activity data can also be used to identify weaknesses and deficiencies in services, such as investigating search terms that are used on OPACs and websites as sometimes they can indicate areas that need addressing with more relevant content or metadata.
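One simple way to look for such gaps is to count the search terms that repeatedly return nothing.  The sketch assumes a log of (search term, number of results) pairs, which is a hypothetical format; the sample terms are invented.

```python
from collections import Counter


def failing_terms(search_log, min_searches=2):
    """List terms that returned zero results at least `min_searches` times."""
    failures = Counter(
        term.strip().lower()            # treat case and stray spaces as the same term
        for term, result_count in search_log
        if result_count == 0
    )
    return [(term, n) for term, n in failures.most_common() if n >= min_searches]


# Hypothetical OPAC search log.
SEARCH_LOG = [
    ("Harvard referencing", 0),
    ("harvard referencing ", 0),
    ("nursing", 12),
    ("ebooks", 0),
]
print(failing_terms(SEARCH_LOG))
```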

Direct use of user-activity data
There are several areas where user-activity data can be used to provide direct services to users.  Search results could be analysed by course and fed back to people on that course ‘People on your course are searching for this, using these databases, borrowing these books’.
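The ‘borrowing these books’ part of that feedback could be sketched as a simple count of loans per course.  The input format, a list of (course, title) pairs, and the sample data are assumptions made for the example.

```python
from collections import Counter, defaultdict


def top_loans_by_course(loans, n=3):
    """Return the n most-borrowed titles for each course."""
    counts = defaultdict(Counter)
    for course, title in loans:
        counts[course][title] += 1
    return {
        course: [title for title, _ in counter.most_common(n)]
        for course, counter in counts.items()
    }


# Hypothetical loans, already linked to the borrower's course.
LOANS = [
    ("C1", "Title X"), ("C1", "Title X"), ("C1", "Title Y"),
    ("C2", "Title Z"),
]
print(top_loans_by_course(LOANS, n=2))
```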

A key point is to feed that data back into wherever your users are accessing the library service, whether that is the OPAC, VLE or elsewhere through widgets or gadgets.    Getting the data fed back in real-time is the ultimate challenge as too many systems still rely on batch processing of data.

Loans data from other HEIs is also valuable particularly if it can be mapped to your own courses so your students can see what books students on similar courses elsewhere are borrowing.

One area where user activity data might help is in providing support to people using e-resources.  Users often seem to struggle understanding what databases to use.  Taking database search terms and the results and then building some form of success or ranking system might enable us to build a more intelligent system that could guide users to appropriate resources by indicating what people searching ‘Web of Science’  for example, were searching for; or by taking their search query and suggesting suitable databases.
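A very rough sketch of the second idea, suggesting databases for a query, is shown below.  It assumes a search log of (database, query, successful) records, where ‘successful’ is whatever success measure the library settles on (for instance, the user went on to open a result); the log format, the success flag and the database names are all hypothetical.

```python
from collections import defaultdict


def build_index(search_log):
    """Count, for each search term, how often each database gave a success."""
    index = defaultdict(lambda: defaultdict(int))
    for database, query, successful in search_log:
        if not successful:
            continue  # only successful searches count as evidence
        for term in query.lower().split():
            index[term][database] += 1
    return index


def suggest_databases(index, query, limit=3):
    """Rank databases by how often the query's terms succeeded in them."""
    scores = defaultdict(int)
    for term in query.lower().split():
        for database, hits in index.get(term, {}).items():
            scores[database] += hits
    # Highest score first; ties broken alphabetically for a stable order.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [database for database, _ in ranked[:limit]]


# Hypothetical search log.
DB_LOG = [
    ("Database A", "climate change", True),
    ("Database A", "climate policy", True),
    ("Database B", "climate fiction", True),
    ("Database B", "victorian fiction", True),
    ("Database A", "fiction", False),
]
index = build_index(DB_LOG)
print(suggest_databases(index, "climate"))
```

Simple term matching like this would miss synonyms and phrases, so it is a starting point for experiment rather than a finished recommender.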

Resource or Customer?
One final thing strikes me.  Libraries have long made use of statistical data as a decision-making tool but in the main have tended to use data at a category level (e.g. books on a particular subject, or at a particular class number range).  Data also tends to be analysed on a location and time basis,  e.g. books on these subjects are popular in this library or less popular at that library.   I think you could perhaps categorise this as a ‘resource-centric’ view of usage data.  What happens with the resource is what libraries are concerned with.

In contrast the retail sector seems to take a different approach that you could perhaps categorise as a ‘customer-centric’ view of their user-activity data.  What is the customer buying?  What will encourage the customer to buy x?  There is still a time and location element, with different patterns in different stores.

For libraries, this would mean starting to look at the pattern of our customers’ use and using the data to predict what they might be interested in from their previous loan or search patterns.  If they are following a course, we should be able to push relevant resources to them at the appropriate time in much more of a real-time fashion.
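One simple way to sketch that kind of prediction is a ‘people who borrowed this also borrowed’ lookup built from loan history.  The example assumes loans are available as (borrower id, item) pairs with the borrower ids already anonymised; the ids and item names are invented.

```python
from collections import Counter, defaultdict
from itertools import permutations


def build_also_borrowed(loans):
    """For each item, count the other items borrowed by the same people."""
    items_by_borrower = defaultdict(set)
    for borrower, item in loans:
        items_by_borrower[borrower].add(item)
    related = defaultdict(Counter)
    for items in items_by_borrower.values():
        # Every ordered pair of distinct items one person has borrowed.
        for first, second in permutations(sorted(items), 2):
            related[first][second] += 1
    return related


def also_borrowed(related, item, n=3):
    """Return up to n items most often borrowed alongside `item`."""
    return [other for other, _ in related[item].most_common(n)]


# Hypothetical, anonymised loan history.
LOAN_HISTORY = [
    ("u1", "A"), ("u1", "B"),
    ("u2", "A"), ("u2", "B"), ("u2", "C"),
    ("u3", "A"),
]
related = build_also_borrowed(LOAN_HISTORY)
print(also_borrowed(related, "A"))
```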

Using evidence of what customers are doing to shape personalised services is now common practice across the business and commercial sectors. From giving people special offers based on their loyalty card transactions to the ‘Customers who bought this also bought this’ of Amazon, many companies are exploiting user activity data in numerous ways.

Libraries, though, have been very slow to realise the potential of activity data. With modern Library Management Systems now recording and keeping transactions for many years, and with the growth of systems such as EZproxy to control access to e-resources, libraries now have a rich seam of user activity data. But, with the exception of a few people such as Dave Pattern at Huddersfield, the take-up in the HE sector hasn’t been high.

[Interestingly there is some use of this data in the public library sector. Evidence Based Stock Management takes details of library loans and uses that data to build a set of reports to allow librarians to make decisions on stock purchasing. In times where public library stock funds are under severe financial pressure (and from personal experience that has long been the case), being able to have evidence of stock performance is a powerful value for money, stock quality and customer service tool.]

To investigate the potential, JISC have funded the MOSAIC project (Making Our Shared Activity Information Count).  MOSAIC ‘is investigating the technical feasibility, service value and issues around exploiting user activity data, primarily to assist resource discovery and evaluation in Higher Education.’ MOSAIC builds on some of the work undertaken as part of the JISC TILE project and has used the user-activity data from Huddersfield University to build a demonstrator system and as the basis for a developer competition to identify some possible ways user activity data can be used. The project has also been looking at issues around data protection and getting permission for data to be made available for reuse.

I was able to attend one of the workshops the project held earlier this month to look at the work so far and the competition entries. The workshop aimed to get feedback from a range of librarians and academics. It was good to get to look in detail at the competition entries and talk about the work of the project. There were a total of six competition entries:

Interestingly three different approaches were taken with essentially the same set of data. The three approaches covered different ways of search, tools to show value for money and using the data to provide information about particular courses. Given the limited data available and the short timescale to develop a new service it was encouraging to see so many different approaches. The MOSAIC project is in the process of writing up their project report and it will be interesting to see what the outcomes will be.

In the next blog post I hope to cover some of the ways that libraries can make use of user-activity data to improve or deliver new services.

Feedback from the JISC ‘Modelling the Library Domain’ workshop in June is now available on the web.  There are a few interesting comments amongst the feedback about the definitions.

There are also some good suggestions about next steps, particularly about providing some more explanation and clarification/exemplars to make it easier for people to understand the applicability of the Model.

Listening to a thought-provoking presentation by Martin Weller recently  on ‘Blogging and Academic Identity’ raised a few questions about the challenges that digital scholarship sets for libraries.

Broadly, the idea is that some academics are increasingly building an academic identity using social networking tools.  Essentially they are building academic networks in the web 2.0 environment and are starting to publish to the web as an alternative to traditional academic publishing models.

For libraries a model of academic output that uses blogs, slideshare and flickr as much as peer-reviewed learned journals represents a major challenge and potentially leads to a fundamental rethinking of how libraries approach the management of resources.  Some of the issues include:

  • the potentially ephemeral nature of blogs, forums and other shared-spaces
  • how do librarians evaluate the value of material published in ‘social academia’?
  • how do librarians track down relevant links in blogs and evaluate the significance?
  • what about conversations that take place in shared spaces amongst networks of academics?
  • who will be first to cite a Twitter stream as evidence of thoughts or concepts?

The potentially ephemeral nature of web content in the social networking arena brings to mind the world of antiquity, where some of the great ‘lost works’ of ancient literature are known to us only through references to them in later works.   A curious situation for the 21st century.

Ideas around self-service seem to be very much in vogue across much of the library sector.  Even in libraries where the number of circulation transactions is low the idea of self-service is very much at the front of managers’ minds. Having spent a large part of the last two years looking at self-service from the perspective of public libraries, it is interesting to start to look at it from a different perspective.

For the public library RFID is the preferred form of technology.  With high loan rates the initial cost of tags, tagging and self-service equipment can be justified against the large amount of staff time spent on manual intervention, in terms of modernising services, extending opening hours or reusing staff time more effectively.    Having procured and implemented RFID systems, converted half a dozen libraries to RFID and worked with the cultural changes, quite a few thoughts and issues come to mind which it may be of interest to set out here:

  • Although library systems have changed over the years – library processes essentially have stayed the same – customer takes books to a member of staff -> who does something with the books -> and gives them back to the customer – that has remained the same through Browne, Photocharging, data capture systems and online LMSs
  • By now it is pretty well understood that the cost of systems is easily double the cost of the hardware/software/tags/tagging – take into account changes to the library (and there are many), cabling, power, new furniture, systems configuration, costs of licences/linking to LMS, staff training
  • Self-service is a big cultural change for library staff – involve them in the process and plan to train them in how to work in the new environment – don’t underestimate the sense of dislocation that no longer having a counter to work at leaves
  • Tagging items costs pretty much the same as the tag – use specialist teams if you can – you may/may not want to manage them yourself but you need to be sure to quality control and check their work
  • Don’t underestimate the need to change your routine processes – go through them all carefully – start/end of day processes, what happens with money, returned stock, when tagging is security turned on or off?
  • Connections to the LMS need to be carefully tested – leave a lot of time for this – make sure you test all the transactions and combinations.  Particularly test things that are different or have unusual setup
  • When redesigning your self-service library you need to direct customers to the self-service equipment rather than the counter – take the counter out – replace it with a small desk away from the main traffic flows – 90% self-service is achievable – but it has to be seen as the main means of borrowing/returning – train staff to get them to show customers how to use RFID
  • Be aware that some RFID units have a wider read range so can pick up items sited too close to the devices

For the public library sector RFID has very much been the preferred direction.  There are some issues: the cost is high (both initial capital costs and ongoing revenue costs), and some customers dislike self-service, much as they dislike any form of technology (but many will embrace it, and the majority do accept it).

In reality it is being used as a means of reducing the number of staff needed to staff a library at any one time (which works at medium/larger libraries or if you plan small entirely self-service libraries).  Although it can pay for itself by reducing the staff cost over five years, it remains costly technology and the life expectancy of the tags and equipment is still unclear.  Finding the budget to replace Library Management Systems has often been tricky; finding larger amounts of budget to replace RFID systems may be yet more challenging.

That being said, RFID technology is helping to change the way libraries are run.  It is helping to drive the modernisation of libraries and getting people to think about improving the look of libraries and considering how many of the traditional library rules and regulations are relevant.    Looked at optimistically it does free staff from the day-to-day manual loan/return processes  and allow the possibility of more time to spend in other library activities.

JISC Modelling the Library Domain workshop

Interesting workshop, learnt a great deal about what things HE librarians are thinking about – JISC didn’t seem too confident that the terminology – realms, corporation, channel and client – is right.

  • Realms – the concept of the ‘Public Realm’ as an example of where the terminology has been used recently wasn’t really picked up – references to historical realms which don’t really help
  • Corporation – most difficulty with this concept – and it may not have helped that the scenarios all focused around cataloguing
  • Channels – fairly clear that channels connect to the client – some channels you own/manage/control – some you don’t – but do ‘we’ really care about ‘controlling’ the channel
  • Clients – described as individuals – but actually individuals, groups, organisations any end user of the services – could be the corporation as an ‘end-user’

Categories viewed as ‘bounded’? – presumably discrete? – but I wonder whether the distinctions are so clear-cut.   Can you fulfil more than one of those roles at different times?

Another thought is that the domain model is aiming to describe the world of the HE library sector.  If the objective is to engage the HE learning domain and help it understand the contribution that HE libraries make to the student experience, then it follows that there has to be terminology, vocabulary and language that resonates with the decision makers in HEIs.   It strikes me then that perhaps we need to look at the library domain model through the perspective of stakeholders – and design views of the model that articulate the value of libraries to them.    In other words, how the library domain fits into the HE sector domain.

Final thought – fascinating to find that the HE library sector has exactly the same ‘why are we here?’, ‘what are we for?’,  ‘what are we going to do’, navel gazing tendencies that the public library sector goes in for.


Creative Commons License