Preserving for future access
Launched in 2005, Portico has now attracted an impressive number of publisher and library partners. Executive director Eileen Fenton tells us why preserving information is important
Why is preservation important?
Libraries often no longer hold copies of materials locally. Instead, the materials sit on the publishers’ servers. Libraries and publishers understand how access is happening today but they are less clear about how it will happen in the future, especially if a publication ceases to be published.
Our mission is to preserve scholarly literature so that it can be accessed over time. We work with publishers, initially to preserve e-journals but now also to preserve e-books. We are also looking at digitised content, such as newspaper collections, which are important to many libraries and scholars.
Publishers create materials in many different formats. Our data specialists and programmers help to convert content from the publisher’s format to an archival format. Both versions are then put into the Portico archive. The metadata that comes with each publication is also converted into an archival format, and we keep the original alongside it so that we can track both. We also keep events metadata so that we can track everything that happens with each file.
Currently we have nearly 8 million articles, which equates to about 80 million files and approximately 7.5TB of data. The volume of data is surprisingly small because the content is currently largely text-based. Another 14-15 million articles have been promised. These come from nearly 8,000 journals from around 60 publishers.
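To put those figures in perspective, the rough arithmetic can be sketched as follows. This is only a back-of-the-envelope calculation using the rounded numbers quoted above; it assumes the 7.5 figure means decimal terabytes (10^12 bytes), and the function and variable names are illustrative.

```python
# Rounded figures quoted in the interview.
ARTICLES = 8_000_000
FILES = 80_000_000
ARCHIVE_BYTES = 7.5e12  # assumption: 7.5 decimal terabytes


def per_item_stats(articles, files, total_bytes):
    """Return (files per article, bytes per article, bytes per file)."""
    return files / articles, total_bytes / articles, total_bytes / files


files_per_article, bytes_per_article, bytes_per_file = per_item_stats(
    ARTICLES, FILES, ARCHIVE_BYTES
)

print(f"files per article: {files_per_article:.0f}")            # 10
print(f"average article:   {bytes_per_article / 1e6:.2f} MB")   # 0.94 MB
print(f"average file:      {bytes_per_file / 1e3:.2f} kB")      # 93.75 kB
```

On these assumptions each article averages about ten files and just under 1MB, which is consistent with largely text-based content.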
We hold these materials on behalf of publishers and libraries. If a publisher were to go out of business or cease to publish a title [a so-called trigger event], Portico would be able to provide access. We only provide this access to those libraries that fund Portico’s ongoing preservation work, but we provide it to all our libraries, irrespective of whether they previously had subscriptions to that title.
About 469 libraries from 13 countries are Portico participants. Information suffers from the free-rider situation, where people assume that somebody else will pay, and we try to avoid that with preservation. The contributions that we seek from libraries vary in size depending on the libraries’ materials expenditure. Publishers also contribute and their fees are linked to their revenue.
What are the benefits?
There is concern about long-term access. Libraries can ensure that there are no gaps in their collections by working with an independent third party like Portico. We routinely receive queries from libraries about what content is actually in the archive. In response to these, we provide customised reports to libraries that compare the archive’s holdings to a library’s local holdings or subscriptions.
This comparison service has proven to be quite useful to libraries as they make decisions about what print to discontinue or to move to remote storage or deaccession.
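At its simplest, a holdings comparison of this kind is a set operation over title identifiers. The sketch below is purely illustrative and is not Portico's actual reporting process; the function name and the placeholder identifiers are hypothetical.

```python
def compare_holdings(archive_titles, library_titles):
    """Split a library's titles by whether they appear in the archive."""
    archive = set(archive_titles)
    library = set(library_titles)
    return {
        "preserved": sorted(library & archive),
        "not_preserved": sorted(library - archive),
    }


# Placeholder identifiers, not real journal titles or ISSNs.
archive_titles = ["title-a", "title-b", "title-c"]
library_titles = ["title-b", "title-c", "title-d"]

report = compare_holdings(archive_titles, library_titles)
print(report["preserved"])      # ['title-b', 'title-c']
print(report["not_preserved"])  # ['title-d']
```

Titles in the "preserved" list are candidates for print withdrawal or remote storage; those in "not_preserved" would need to be retained or followed up.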
Publishers can also designate Portico to accommodate post-cancellation access to their journals. This relieves publishers of the administrative burden of serving customers from whom they no longer receive subscriptions, provided that these libraries fund Portico’s ongoing work. We are starting to see libraries inserting requests into their publisher agreements that Portico be the location for post-cancellation access.
It is also in the publishers’ own interests. They might have their own backups, but it is good to have an independent archive. The process of moving content from a proprietary format to an archival format can enable us to detect systematic errors in publishers’ files. This might be, for example, flaws in JPEGs created by their vendors. We can give them feedback about problems in their data and insight into industry-wide standards and trends.
Who do you work with?
We link with a variety of organisations including the Digital Preservation Coalition in the UK, the Center for Research Libraries (CRL) in the USA and the US Library of Congress’s National Digital Information Infrastructure and Preservation Program (NDIIPP). We also have a formal agreement with the Koninklijke Bibliotheek, the National Library of the Netherlands, for it to hold an offline copy of the Portico archive that we deliver at least twice a year. We are also looking for another off-site partner.
We work quite closely with CrossRef, considering, for example, what sort of adjustments need to be made to an article’s links when a journal is triggered.
What are the big challenges ahead?
As publishers and libraries have become more digitally orientated, the need for preservation has become embedded in what they do. Libraries are thinking about print and digital preservation in a holistic way. Initially, publishers were under pressure from libraries to make preservation available, but now they are trying to stay ahead of libraries and they expect us to preserve their whole portfolio.
One of the challenges is the quantity of content to be preserved. We’ve made good progress with journals, but there is a growing volume of book and digitised-collection content. In addition, there are big challenges with preserving databases because they are dynamic, not static. Nobody has completely resolved this yet, but people are working on it.
Interview by Siân Harris