'Libraries should open up data'
Data was one of the buzzwords at the Online Information exhibition and conference held in London in December. Siân Harris reports on discussions about linking library data.
‘The library ecosystem is somewhat stagnant and the threshold for innovation is really high,’ stated Martin Malmsten, senior developer, National Library of Sweden, in a session about linked data at the recent Online Information conference.
He was speaking in favour of libraries opening up their data so that others can use it. ‘It is important that we use the same technology as outside the library. There is a growing amount of structured information that we could interact with. We get enthusiastic people wanting to use our data and then we tell them about the licence conditions and they are not so keen,’ he continued. ‘Licensing is a huge problem for mash-ups. Even permissive licences such as Creative Commons (except for CC0) are problematic for data.’
Allowing others to access and use library data opens up a range of possibilities, as Malmsten and other speakers explained. Consultant Karen Coyle showed the example of a web-based initiative called the ‘Open Library’, where data from across the web is drawn together into machine-generated web pages about different books and authors. Users see a host of links on each page and can track keywords and terms through the materials to see trends, such as books that reference a particular person or idea. She pointed out how much more powerful this type of resource would be if all libraries opened up their data to be used in such a way.
‘Allowing others to link to your data doesn’t destroy your data,’ she argued. She also noted that the use of identifiers can differentiate library-derived data from general web data, which may not be so rigorous in its accuracy. ‘There is a world outside where people don’t have good bibliographic data. If we contribute our bibliographic data we can raise the quality.’
‘Giving away our data could be scary but if it’s paid for by taxpayers why shouldn’t it be available?’ Malmsten challenged, going on to add: ‘If library data isn’t open, this is contrary to what libraries should be about.’
Semantic web
‘Linked data provides definitions of relationships and we can all contribute,’ said Sarah Bartlett, senior analyst at Talis. Key to linked data is the semantic web, in which information is read and used by machines and the relationships within it are defined.
But there are challenges, as Karen Coyle identified. The records in library catalogues today are based on the style of the card records used in libraries for many years. ‘The downside of this is that the content of the record in the online catalogue looks almost identical to the content of the card, and the card was a text document – a highly structured document, but essentially a document,’ she pointed out.
‘Everything in the record is potentially a “thing” in the semantic web sense; a thing that could have relationships with other things. With the record as it is, however, all of the things only have a relationship with the item they describe,’ she explained.
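Coyle’s distinction can be illustrated with a minimal Python sketch. It is not taken from her talk: the `example.org` identifiers, the `Triple` type and both helper functions are hypothetical, and real linked data would normally use an RDF toolkit and published vocabularies. The point is only that once the author is a ‘thing’ with its own identifier, it can have relationships with several other things instead of appearing as text inside one record.

```python
from typing import NamedTuple

# Hypothetical namespace, used purely for illustration.
EX = "http://example.org/"


class Triple(NamedTuple):
    """One statement: a subject, a named relationship, and an object."""
    subject: str
    predicate: str
    obj: str


# Two catalogue records re-expressed as statements about identified things.
triples = [
    Triple(EX + "book/moby-dick", EX + "title", "Moby-Dick"),
    Triple(EX + "book/moby-dick", EX + "author", EX + "person/melville"),
    Triple(EX + "book/typee", EX + "title", "Typee"),
    Triple(EX + "book/typee", EX + "author", EX + "person/melville"),
    Triple(EX + "person/melville", EX + "name", "Herman Melville"),
]


def statements_about(thing, graph):
    """Return every statement in which `thing` is the subject or the object."""
    return [t for t in graph if t.subject == thing or t.obj == thing]


def works_by(person, graph):
    """Return the sorted identifiers of all books linked to `person` as author."""
    return sorted(
        t.subject
        for t in graph
        if t.predicate == EX + "author" and t.obj == person
    )


melville = EX + "person/melville"
print(works_by(melville, triples))
print(len(statements_about(melville, triples)))
```

In a card-style record the author’s name is a string attached to a single item; here the same author is one node that both books point to, which is the kind of relationship Coyle describes.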
And retrieving things by machine from these records is not straightforward. ‘What is data to a person is not data to a computer or any other method of machine manipulation. Some of the most important data elements in our records, like ISBNs, aren’t machine-actionable because they are in text fields often together with text,’ she added.
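The ISBN problem can be sketched in a few lines of Python. The field contents in the usage lines are invented for illustration and the function names are hypothetical; this is one possible approach, not a description of how any particular catalogue system works. The sketch looks for ISBN-shaped strings inside a free-text field and keeps only those whose check digit is valid, using the standard ISBN-10 and ISBN-13 checksum rules.

```python
import re

# Candidate pattern: an optional 978/979 prefix, nine digits, then a final
# digit or X, with optional hyphens or spaces between characters.
ISBN_CANDIDATE = re.compile(r"\b(?:97[89][- ]?)?(?:\d[- ]?){9}[\dXx]\b")


def normalise(candidate):
    """Strip hyphens and spaces and upper-case a trailing 'x'."""
    return re.sub(r"[- ]", "", candidate).upper()


def is_valid_isbn10(isbn):
    """ISBN-10: weights 10 down to 1, total divisible by 11 ('X' counts as 10)."""
    if not re.fullmatch(r"\d{9}[\dX]", isbn):
        return False
    total = sum(
        (10 - i) * (10 if ch == "X" else int(ch)) for i, ch in enumerate(isbn)
    )
    return total % 11 == 0


def is_valid_isbn13(isbn):
    """ISBN-13: alternating weights 1 and 3, total divisible by 10."""
    if not re.fullmatch(r"\d{13}", isbn):
        return False
    total = sum(int(ch) * (1 if i % 2 == 0 else 3) for i, ch in enumerate(isbn))
    return total % 10 == 0


def extract_isbns(field_text):
    """Return the normalised, checksum-valid ISBNs found in a free-text field."""
    found = []
    for match in ISBN_CANDIDATE.finditer(field_text):
        isbn = normalise(match.group())
        if is_valid_isbn10(isbn) or is_valid_isbn13(isbn):
            found.append(isbn)
    return found


# Invented examples of an ISBN sharing a text field with other information.
print(extract_isbns("0306406152 (pbk.) : $12.95"))
print(extract_isbns("ISBN 978-0-306-40615-7 (hbk.)"))
```

The need for pattern matching and checksum tests just to recover an identifier is itself a sign that the field was designed for human readers, which is Coyle’s point.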
Another challenge is that the semantic web is still largely a matter of low-level code. As Coyle put it, it is ‘at the stage where the only people who really understand this are pretty geeky, not at the point where an information professional needs it to be for them to see how it can be used.’
She continued: ‘I don’t think the semantic web will be heavily used until it is as easy to use as the web. It could look very much like the web today but the semantic web will be richer because the links have meaning and are contextual.’