Repositories play their part in institutions
Sian Harris speaks to four providers of institutional repository software and services
Institutional repositories (IRs) play a key role in open-access policies and programmes – and in showcasing an institution’s activities.
‘Institutional repositories are evolving; initially they were local repositories for the scholarly output of an institution – primarily documents like theses, dissertations, and journal publications. However, modern IRs need to support a broader range of content, including multimedia and research datasets (which can be large and complex),’ noted David Wilcox, product manager for Fedora with DuraSpace. ‘Additionally, IRs are starting to participate in the world of linked open data, including external authorities for authors and subjects, as well as the capability to openly share local resources in a standardised way.’
For Irene Kamotsky, director of strategic initiatives at Bepress: ‘The two themes we see are low-hanging fruit: digital content that is already in hand or easy to get and has few copyright issues; and content that is tied to a strategic goal or high-level initiative at the institution.’ She noted that ‘previously published articles are still a core collection at many institutions, but these other types of projects drive more readership and more engagement on and off campus.’
IRs today often need to support research management functions, gathering materials for research assessment. Leslie Carr, a professor in the Web Science Institute of the UK’s University of Southampton and one of the team behind the open-source eprints software, commented: ‘With the UK’s Research Excellence Framework (REF), for example, there is particular reporting required and we have to add in extra fields and tighten up the metadata – but, REF or no REF, a repository needs to tell the story of the institution.’
Preservation is an important component of IRs too. ‘The primary use case is the long-term preservation of digital assets,’ commented Samantha Fritz, interim project and community manager of Islandora Foundation. ‘Many institutions use IRs to bring awareness to unique materials and content within archives and special collections. IRs are a great opportunity for an expansion in outreach as individuals across the world can access digital objects that had, in the past, largely been limited to the traditional physical setting.’
Different approaches
The picture of repositories and repository software is complex, with a combination of home-grown solutions, open-source tools and commercial products and services, many of which are built on open-source software.
Digital Commons from Bepress has an interesting history because the company started life as a publisher and the Digital Commons platform has retained publishing services features such as peer-review management in addition to its IR capabilities. ‘When a school uses Digital Commons they use it for all traditional IR functions but also for library-led publishing,’ explained Kamotsky.
One of the questions people debate with repositories is whether institutional or subject-based repositories are the better approach. Carr of eprints noted that people in disciplines with established subject repositories are very much in favour of them and that there are many advantages, for example the huge coverage of ArXiv. On the other hand, he said, ‘evidence suggests that those disciplines that are well matched to having subject repositories probably already have one.’
In addition, he noted that there is a challenge of where to host subject repositories. ‘They are not particularly expensive but they are not free,’ he observed, giving the example of how ArXiv had to leave Los Alamos National Laboratory and was then given a new home at Cornell University. ‘Where does the responsibility for producing the research and employing the researcher lie? It’s in the institution,’ he added.
Discovery
Discovery of content is an important part of the value of any repository. ‘Discoverability, including features that allow users to search and browse, is critical considering that IRs tend to house extraordinary amounts of content and data. Without discovery layers and tools, these collections would be virtually invisible,’ explained Fritz of Islandora Foundation.
There are some key ways to enable this. ‘One of the lessons that people learnt very quickly with repositories is that the vast majority of access is driven by Google. The single most important thing that they can do is to make sure that the repository can be easily crawled by Google,’ observed Carr.
Kamotsky of Bepress agreed: ‘We are constantly updating our platform to be optimised for open web searches such as Google and Google Scholar. Our repositories receive millions of downloads, and around 85 per cent of those come from open web searches. Digital Commons is also OAI-PMH compliant and is easily integrated in library discovery tools and aggregators.’ She continued: ‘Universities and departments are interested in showcasing content coming from their campus, but individual researchers are more interested in what’s going on within their field, regardless of university.
‘We saw a need for a tool that would gather content from all 300-plus Digital Commons repositories and make it easy to browse on a disciplinary or topical basis. This idea developed into the Digital Commons Network, which makes over a million full-text, OA research articles available for researchers to browse.
‘We see this as a valuable tool for researchers and a way for libraries to build interest in their own repository initiatives, but it also demonstrates the magic that happens when you create connections between hundreds of successful repositories.’
‘Interoperability has always been a major focus for us,’ said Carr. ‘The eprints software was developed by researchers and we are also users of the software. As computer scientists and information scientists we are very aware of the need to build software that is not just silos.’
He added that it is also very easy to link to external reporting tools, which is important when institutions need to use their repository to feed into research assessment.
‘In the early days interoperability was not as big of a concern, but it’s increasingly becoming an issue – particularly when it comes to authorities and unique persistent identifiers,’ added Wilcox of Fedora.
‘Content in repositories is now expected to be permanently accessible via a unique URI, primarily so it can be shared and reused in other contexts. The rise of linked open data also points to this desire for interoperability.’
Fritz, of Islandora Foundation, noted: ‘The library community has a particular interest in interoperability, whether in the context of global OA collections of scholarly content, or, more locally, in the context of a consortium of libraries. Islandora responds to this by supporting OAI and by providing a system that facilitates the hosting of organisations in a single framework.’
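For readers curious about what OAI support looks like in practice, the sketch below shows roughly how a harvester might ask an OAI-PMH endpoint for records and read back Dublin Core titles. It is a minimal illustration, not any vendor's implementation: the endpoint address, the record identifier and the sample response are invented for the example, and a real harvester would also need to handle resumption tokens and protocol errors.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# XML namespaces defined by the OAI-PMH 2.0 and Dublin Core specifications.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build a ListRecords request URL for an OAI-PMH endpoint."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)


def parse_records(xml_text):
    """Return (identifier, title) pairs from a ListRecords response."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.iterfind(".//oai:record", NS):
        identifier = rec.findtext("oai:header/oai:identifier", namespaces=NS)
        title = rec.findtext(".//dc:title", namespaces=NS)
        records.append((identifier, title))
    return records


# Hypothetical response, trimmed to one record, standing in for a live reply.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:repository.example.edu:123</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>A sample thesis</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

if __name__ == "__main__":
    # repository.example.edu is a placeholder, not a real repository.
    print(list_records_url("https://repository.example.edu/oai"))
    for identifier, title in parse_records(SAMPLE_RESPONSE):
        print(identifier, "|", title)
```

Because every compliant repository answers the same small set of requests in the same XML format, an aggregator or library discovery tool can harvest from many institutions with one piece of code, which is the practical meaning of the interoperability described above.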
The future
Fritz also noted some of the challenges ahead, including the rapid change in technology and the need to support and maintain software and processes. In addition, there is copyright management: ensuring that an institution has the right to disseminate digital content and confirming copyright clearance before depositing material into an OA platform. There is also the need for ‘sustainability: feasible long-term funding and support’ and for increasing the visibility of scholarship by integrating systems like Altmetric. She also noted the need for IRs to ‘step up to the “data stewardship” bar.’
Kamotsky of Bepress added: ‘Faculty are increasingly savvy about their digital presence and visibility for their scholarship. One challenge is to make sure that repositories keep up with faculty’s rapidly emerging needs. IRs today find themselves competing for faculty’s attention with services like ResearchGate, Academia.edu and even Amazon’s self-publishing tools. Authors don’t read the fine print with commercial entities like these, and may find themselves giving rights away just like they did with commercial journals.
‘Librarians have a uniquely appropriate skill set to encourage safe OA scholarly publishing, and they have a great opportunity to become the go-to experts on campus and help guide faculty and students towards non-predatory publishing. It’s new territory for librarians to compete for content with commercial alternatives, but it’s vitally important that they take a leadership role in building a scholarly communications system that appeals to faculty, so that faculty don’t need to look elsewhere.’
According to Samantha Fritz of Islandora Foundation: ‘The requirements for an IR will differ depending on the platform being used. However, broadly speaking, an institution implementing an IR needs organisational commitment to supporting an IR project; the ability to dedicate individuals to the project, from the managerial side to the technical and administrative aspects; and familiarity with web-based software and databases, systems, languages, and metadata practices.
‘The organisation also needs to define a scope for the types of digital assets that will be captured and displayed by the IR. IRs, to varying degrees, offer flexibility and customisation. It’s great when you see institutions using outside the box solutions to deal with unique instances of digital assets. While this requires some risk taking, an IR can be as complex and rich as an institution is willing to build.’
Kamotsky of Bepress had some further advice for institutions: ‘The repository is a tool to manage the image of the university, and it must be a professionally designed and curated showcase of works that faculty, students, and administrators are proud to show others.
‘A repository platform needs to display beautifully all types of content, including images, video, audio, and datasets. To engage faculty, students, and university administration, a successful repository needs to increase the visibility of their work, and provide quantitative evidence of their global reach,’ she commented.
‘On the library’s side,’ she added, ‘successful IR programmes are those that focus on campus outreach rather than systems and technology. The ideal repository manager is an extrovert who’s excited about talking to faculty and students about scholarly communications. The library shouldn’t assume that faculty will automatically understand the value of the repository and be willing to upload their own material. Libraries should provide a comprehensive suite of services designed to appeal to faculty’s emerging needs. At schools using this model, repository and publishing tools become one of the most well-used and well-loved library services on campus.’