I’ve been reading a great blog post by Peter Morville on Semantic Studios, ‘Inspiration Architecture: the Future of Libraries’, and it includes a description that really resonated:

There was even a big move towards the vision of “library as platform.” Noble geeks developed elaborate schemata for open source, open API, open access environments with linked data and semantic markup to unleash innovation and integration through transparency, crowdsourcing, and mashups. They waxed poetic about the potential of web analytics and cloud computing to uncover implicit relationships and emerging patterns, identify scholarly pathways and lines of inquiry, and connect and contextualize artifacts with adaptive algorithms. They promised ecosystems of participation and infrastructures for the creation and sharing of knowledge and culture.

Unfortunately, the folks controlling the purse strings had absolutely no idea what these geeks were talking about, and they certainly weren’t about to entrust the future of their libraries (and their own careers) to the same bunch of incompetent techies who had systematically failed, for more than ten years, to simply make the library’s search box work like Google.

I’ve highlighted the last bit as it really struck home.  The great search for a library equivalent of the Google search box is something familiar to anyone working to build better ways of helping users get to library content.  It has pretty much been a mantra over the past few years.  (For a great summary of how library search systems differ from Google, look at Aaron Tay’s blog post from last May and his blog post on web scale discovery from December last year.)  So it’s easy to find examples of where libraries and other organisations have tried to put in place a Google-like search: from the Biodiversity Heritage Library, from the American University, and from others such as the National Archives and Records Administration (reported in Information Management Journal, March 2011) and Oregon State University (paper by Stefanie Buck and Jane Nichols, ‘Beyond the search box’, in Reference & User Services Quarterly, March 2012).

The current generation of discovery systems (Summon, EDS, Primo etc.) is largely built around the concept of a ‘google-like’ search, as reported here for McGill University and by OCLC for WorldCat Local. In some ways it seems to me that we’ve been concentrating too much on the simplicity of the original Google interface. As Lorcan Dempsey pointed out in his ‘Thirteen Ways of Looking at Libraries, Discovery and the Catalog’:

‘a simple search box has only been one part of the Google formula. Pagerank has been very important in providing a good user experience, and Google has progressively added functionality as it included more resources in the search results. It also pulls targeted results to the top of the page (sports results, weather or movie times, for example), and it interjects specific categories of results in the general stream (news, images).’

So although we’ve implemented a ‘google-like’ search box, it becomes apparent that it doesn’t entirely solve the problem. It’s a bit like a false summit or false peak: you think you’ve reached the top but realise that you still have some way to go.  Relevancy ranking becomes vitally important, and with this generation of discovery services you’ve essentially handed control of the relevancy algorithm over to a vendor.  You can add your local content into the system and have some control, but it is limited, and you are constrained in what you can add into the discovery platform.  Your catalogue and link resolver/knowledge base, generally yes; your institutional repository, yes; but your other lists of resources in simple databases, not so easily, unless they happen to be available as OAI-PMH or MARC.
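To be clear about what that last constraint means in practice, here’s a rough sketch (in Python, against a hypothetical repository URL) of the harvesting side of OAI-PMH: the discovery platform simply requests Dublin Core records over HTTP, so anything you want it to ingest has to be exposed this way (or as MARC).

```python
# Minimal sketch of harvesting one page of records over OAI-PMH.
# The base URL is a hypothetical example; the verb and metadataPrefix
# parameters are part of the standard protocol.
import requests
import xml.etree.ElementTree as ET

OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"
DC_NS = "{http://purl.org/dc/elements/1.1/}"

def harvest_titles(base_url):
    """Fetch one page of Dublin Core records and return their titles."""
    response = requests.get(base_url, params={
        "verb": "ListRecords",
        "metadataPrefix": "oai_dc",  # simple Dublin Core, supported by all OAI-PMH repositories
    })
    response.raise_for_status()
    root = ET.fromstring(response.content)
    titles = []
    for record in root.iter(OAI_NS + "record"):
        title = record.find(".//" + DC_NS + "title")
        if title is not None and title.text:
            titles.append(title.text)
    return titles

# Hypothetical institutional repository endpoint
print(harvest_titles("https://repository.example.ac.uk/oai"))
```

The point is that a plain list of resources sitting in a simple database doesn’t speak this protocol out of the box, which is why it’s so much harder to get it into the platform.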

So you look at bringing together content from different systems, probably using the Bento Box approach (as used by Stanford and discussed by them here), where you search across your different systems in parallel using their APIs and get back a set of results from each of those systems.  Each set of results carries the relevancy ranking of the system it came from, rather than the results being relevancy-ranked as a whole.  So is that going to be any better for users?  Is it better to sort the results by system, as Stanford have done, or should we be trying to pull results together, as Google does?  That’s something we need to test.
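As a sketch of what that fan-out looks like, here’s some illustrative Python: the same query goes to each back-end in parallel and the results stay grouped by system, each compartment keeping its own relevancy ranking. The endpoint URLs and the JSON response shape are hypothetical stand-ins, not any particular vendor’s API.

```python
# Rough sketch of a 'bento box' style search: query several back-ends in
# parallel and keep each system's results (and its own ranking) separate.
import concurrent.futures
import requests

# Hypothetical stand-ins for a catalogue, a discovery/article index and a repository
SOURCES = {
    "catalogue": "https://catalogue.example.ac.uk/api/search",
    "articles": "https://discovery.example.ac.uk/api/search",
    "repository": "https://repository.example.ac.uk/api/search",
}

def search_source(name, url, query, limit=5):
    """Query one back-end and return its top results in its own ranking order."""
    response = requests.get(url, params={"q": query, "rows": limit}, timeout=5)
    response.raise_for_status()
    # Assumes each back-end returns JSON with a 'results' list
    return name, response.json().get("results", [])

def bento_search(query):
    """Return results grouped by source: one 'compartment' per system, nothing merged."""
    results = {}
    with concurrent.futures.ThreadPoolExecutor() as executor:
        futures = [
            executor.submit(search_source, name, url, query)
            for name, url in SOURCES.items()
        ]
        for future in concurrent.futures.as_completed(futures):
            name, hits = future.result()
            results[name] = hits
    return results

for source, hits in bento_search("climate change adaptation").items():
    print(source, len(hits))
```

Merging those compartments into a single ranked list would mean inventing a relevancy score that works across systems, which is exactly the part that needs testing with users.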

But there’s a nagging feeling that this still all relies on users having to come to ‘the library’ rather than being where the users are.  So OK, we can put library search boxes into the Virtual Learning Environment, so we have an article search tool that searches our Discovery system, but if your users start their search activity with Google, then the challenge is about going beyond Google Scholar to get library discovery up to the network level.