
I was intrigued to read David Weinberger’s blog post ‘Protecting library privacy with a hard opt-in’. In it he suggests that there is a case to be made for asking users to explicitly opt in to publishing details of their checkouts (loans) before that activity data can be used.  I must admit that I’d completely missed the connection between David Weinberger, author of ‘Everything is Miscellaneous’, and his role with the Harvard Innovation Lab, and I’m sure I’ve probably blogged about both in the past.

The concern that has been raised is about re-identification, where supposedly ‘anonymous’ datasets can be combined with other data to identify individuals.  There’s a good description of the issue in this paper from 2008 from Michael Hay and others from the University of Massachusetts http://scholarworks.umass.edu/cgi/viewcontent.cgi?article=1176&context=cs_faculty_pubs

Obviously an issue of this type is of critical significance when you might be talking about medical trials data for example, but library data might also be personal or sensitive.  Aside from the personal aspects you could also imagine that a researcher carrying out a literature search for material for a potential new research area would not want ‘competitors’ to know that they were looking at a particular area, particularly now cross-domain research activities are more common.

The issue of anonymity, and of potentially being able to identify an individual from their activity data, has been explored through a number of projects, such as Jisc’s Activity Data programme and the synthesis project outputs at http://www.activitydata.org, particularly in the section on data protection.   Most of the approaches tackled anonymization in two ways: by replacing user IDs with a generated ID (described, interestingly, by Hay as ‘naive anonymization’) and by removing data from the dataset if only small numbers of users were included (such as a course with only a few students enrolled).
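To make those two steps a bit more concrete, here is a minimal sketch in Python. It is only an illustration and not the code used by any of the projects mentioned: the field names (`user_id`, `course`, `item`), the function name and the threshold of five users are all assumptions of mine.

```python
import secrets
from collections import defaultdict


def pseudonymise(records, min_group_size=5):
    """Sketch of 'naive anonymization' plus small-group removal.

    records: list of dicts with 'user_id', 'course' and 'item' keys
    (hypothetical field names, chosen for illustration only).
    """
    # Step 1: give each real user ID a random generated ID.
    generated = {}
    for r in records:
        if r["user_id"] not in generated:
            generated[r["user_id"]] = secrets.token_hex(8)

    # Step 2: count the distinct users on each course.
    users_per_course = defaultdict(set)
    for r in records:
        users_per_course[r["course"]].add(r["user_id"])

    # Step 3: keep only rows from courses with enough users,
    # and emit the generated ID instead of the real one.
    return [
        {"user": generated[r["user_id"]], "course": r["course"], "item": r["item"]}
        for r in records
        if len(users_per_course[r["course"]]) >= min_group_size
    ]
```

Note that the generated ID still ties all of one person's activity together, which is exactly why Hay calls this approach naive.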

Re-identification techniques seem to work by finding unique patterns of use, called digital fingerprints, that can be used to identify individuals.  When you combine data from an anonymized dataset with other material you can start to identify individuals.  It certainly seems to be something that needs careful thought when contemplating releasing datasets.
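A toy illustration of the fingerprint idea, as I understand it, might look like the sketch below. It is not the method from the Hay paper; the function name and data layout are my own assumptions. It simply asks which generated IDs in a released dataset are consistent with a handful of loans you happen to know about from somewhere else.

```python
def matching_pseudonyms(loans_by_pseudonym, known_items):
    """Return the generated IDs whose loans include every known item.

    loans_by_pseudonym: dict mapping a generated ID to the items borrowed
    under that ID (hypothetical layout for a released dataset).
    known_items: items an outsider already knows a particular person borrowed.
    """
    known = set(known_items)
    return sorted(
        pseudonym
        for pseudonym, items in loans_by_pseudonym.items()
        if known <= set(items)  # every known item appears in this record
    )


# Hypothetical released data: three generated IDs and their loans.
released = {
    "u1": {"A", "B", "C"},
    "u2": {"A", "B"},
    "u3": {"A", "D"},
}

candidates = matching_pseudonyms(released, ["B", "C"])
```

In this made-up data, knowing about one popular item narrows nothing down, but knowing two of the less common items leaves a single candidate, at which point the rest of that person's loans are exposed too.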

Is the suggested solution of asking for explicit permission the right approach?  If you are planning to release data openly, I’d probably agree.  If you plan to use it only within your systems to generate recommendations, then yes, it’s probably good practice. I worry slightly, though, that a low opt-in level may significantly diminish the value and usefulness of the activity data.

I’m not too convinced, though, about the approach that says that users agree to a public page that lists their activity.  That would seem to me to put off people who would otherwise be happy for their data to be used unattributed in recommendations, so they don’t opt in at all.  When we’ve asked students about their views of what data we should be able to use, they were quite happy for activity data to be used.   My view would be that it’s fine to show an individual what they have used (and we do that), but it’s not something to share.

Everything is miscellaneous
I’ve finally got around to reading “Everything is Miscellaneous” by David Weinberger (yes, I know that is about five years after everyone else, and there was no real reason not to read it other than a sense of not wanting to follow everyone else).  I’m reading it in paperback form as we don’t seem to have it as an ebook, which gives it a slight sense of being older than it actually is, particularly with the pages in the paperback slightly yellowing.  It is also interesting to me to pick up a library book added to stock in 2009 that has two date stamps on the date label. Two loans in four years brings home what a different world academic libraries are from public libraries.

While there’s a slight sense of things having moved on in the post-Twitter world in terms of some of the technologies, it is a really interesting read with lots of things to think about, and it is really making me think about the approach we take to providing access to library materials.  I am particularly thinking about how we present material through our library website, either with search tabs for articles, books etc., or by categorising library resources into journals, databases or ebooks, or even by using different systems to manage different types of material.  As David Weinberger points out, that is just a carry-over from the old analogue and physical world that makes no real sense to users in a digital world.  And that is something that needs reinforcing regularly, as it is easy to lose sight of.

Tagging, sharing and perspective
One of the things that is starting to come out of our personalisation surveying and focus groups is that users want what is relevant to them.   Well, not a great surprise, but then that isn’t something that our systems really facilitate, is it?  Where we are at the moment is to still think that how you get something depends on what type of thing it is.  For a physical library that’s relevant, in that the leaf is only on the tree in one place, to use Weinberger’s analogy.  But in the digital world, all the stuff is website content, and all the constraints are artificially created (that doesn’t mean that they are not necessary in some cases).   So you access ebooks through the catalogue because that is where we put them, often for our administrative convenience, but users might want them in different places at different times.  In a world where users expect to be able to shape their view of the world by customising the ‘library channel’, as you can do with Spotify or any number of web-scale services, the single ‘take-it or leave-it’ library approach seems curiously archaic.

Discovery
So what does that mean for discovery and especially for discovery systems?  Are discovery systems the right solution?  Discovery systems and the Google-like search box are an attempt to pull stuff together into one place.  So upload your catalogue into your discovery platform and you can lose the OPAC – maybe.   It seems to me that relevancy ranking then starts to become a much more important area.  But it still doesn’t really start to approach anything that is particularly ‘socially’ or ‘user-aware’.

As a user you probably want to decide what is relevant to you; you might want to tag that content and probably share it too.  And you’d probably expect to be able to see other users’ tags and use them to find material relevant to you too.   But with library systems we take the view that we have to have special people who we trust to add accurate metadata.  I hate to say this, but I think that’s another legacy of the physical age and not really viable for the explosion in digital content that is upon us.

So you start to have a model where users expect the system to know something about them (what course they are on, for example – does your discovery platform know that?) and to filter based on their likely interests, but then to learn from what they search for (and what others search for, or tag) to find other things they might be interested in.  I start to think that this is at the heart of user dissatisfaction with library systems: there is a great disconnection from their experience of the rest of the web.
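As a very rough sketch of what that model could mean in practice, the Python below suggests items that other people on the same course have used most, leaving out what the user has already seen. It is a thought experiment rather than a description of any real discovery platform: the function name, the `(course, item)` activity format and the course codes are assumptions of mine.

```python
from collections import Counter


def recommend(course, activity, seen, top_n=3):
    """Suggest the items most used by others on the same course.

    course:   the course the current user is on.
    activity: list of (course, item) pairs drawn from other users' use
              (a hypothetical, unattributed activity log).
    seen:     set of items the current user has already used.
    """
    counts = Counter(
        item
        for other_course, item in activity
        if other_course == course and item not in seen
    )
    # most_common() returns items ordered from most to least used.
    return [item for item, _ in counts.most_common(top_n)]


# Made-up activity log for two courses.
log = [
    ("A100", "ebook-x"), ("A100", "ebook-x"),
    ("A100", "article-y"),
    ("B200", "database-z"),
    ("A100", "journal-w"), ("A100", "journal-w"), ("A100", "journal-w"),
]

suggestions = recommend("A100", log, seen={"article-y"})
```

Tags or search terms could be counted in the same way as items, so the same shape of approach would cover learning from what other people tag as well as what they use.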

Is it feasible, could we experiment, what might that space look like?  Discovery is miscellaneous now…

