Recommendations for the Role of Publishers in Access to Data

Jennifer Lin; Carly Strasser

doi:10.1371/journal.pbio.1001975

Abstract

As appeals for public access to research data continue to proliferate, many scholarly publishers (alongside funders, institutions, and libraries) are expanding their role to address this need. Here we outline eight recommendations and a set of suggested action items for publishers to promote and contribute to increasing access to data. This call to action emerged from a summit that brought together data stewardship leaders across stakeholder groups. The recommendations were subsequently refined by the community through public input gathered online and in meetings.

Citation: Lin J, Strasser C (2014) Recommendations for the Role of Publishers in Access to Data. PLoS Biol 12(10): e1001975. https://doi.org/10.1371/journal.pbio.1001975

Published: October 28, 2014

Copyright: © 2014 Lin, Strasser. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Funding: The authors received no specific funding for this work.

Competing interests: Jennifer Lin is an employee of PLOS whose salary is supported by PLOS income derived from the publication of Open Access papers.

Background

Institutions that support research have a vested interest in preserving and promoting the work of their researchers. This work includes scholarly publications, software, datasets, reports, and other outputs. A 2013 memorandum from the White House Office of Science and Technology Policy (http://www.whitehouse.gov/administration/eop/ostp) requires that government funding agencies ensure all research output that results from work they support be publicly available. In the United Kingdom, Research Council policies (http://roarmap.eprints.org/671/1/RCUK%20_Policy_on_Access_to_Research_Outputs.pdf) require that data be made available and preserved for 10 years and that research publications contain a statement on how the underlying materials (such as data, samples, or models) can be accessed. More widely, the European Union Horizon 2020 program (http://ec.europa.eu/programmes/horizon2020/) includes an Open Research Data pilot (http://europa.eu/rapid/press-release_IP-13-1257_en.htm) that will require grantees to share their data.

These new policies have significant repercussions: stakeholders such as institutions and funders will need to provide researchers with the means to preserve and provide access to their research outputs. At the same time, librarians, information technologists, preservation specialists, and others have a long history of providing infrastructure, education, and support for preserving and promoting researchers' outputs. These new policies only bolster the importance of their efforts as they relate to data [1],[2]. Publishers are also a critical stakeholder group as the current “gatekeepers” of formal scholarly research and, increasingly, of other research outputs beyond the research article.

Given this climate of new mandates, changing roles, and increasing challenges, we convened a meeting with a group of leaders in data stewardship to discuss, “What can publishers do to promote the work of libraries and institutions in advancing data access and availability?” The event coincided with the International Digital Curation Conference on February 26, 2014. A diverse group of data experts was present (Box 1), including repository heads, librarians, funders, infrastructure builders, program directors, developers, and researchers. The group developed a range of priorities and recommendations for publishers. To allow attendees to establish a common voice and brainstorm freely, publishers were intentionally not included in the discussion (individuals affiliated with PLOS were present only in the capacity of hosts and facilitators).

Box 1. Contributors to the Role of Publishers Meeting on February 26, 2014

Participants

Stephen Abrams, Associate Director of University of California (UC) Curation Center, California Digital Library
Rachel Bruce, Director, Technology Innovation, JISC
Eleni Castro, Research Coordinator, Institute for Quantitative Social Science (IQSS), Harvard University
Patricia Cruse, Director of UC Curation Center, California Digital Library
Ingrid Dillo, Head Policy Communication Development, Data Archiving and Networked Services (DANS)
Alex Garnett, Data Curation and Digital Preservation Specialist, Simon Fraser University
Jennifer Green, Director of Research Data Services, University of Michigan
Simon Hodson, Executive Director, CODATA
Eric Kansa, Technology Director, Open Context
Belinda Norman, Research Data Manager, University of Sydney
Mark Parsons, Secretary General, Research Data Alliance
Jonathan Tedds, Senior Research Fellow, University of Leicester
Todd Vision, Principal Investigator, Dryad; Associate Director for Informatics, National Evolutionary Synthesis Center

Hosts

John Chodacki, Director of Product Development, PLOS
Jennifer Lin, Senior Product Manager, PLOS
Cameron Neylon, Advocacy Director, PLOS
Carly Strasser, Data Curation Specialist, California Digital Library

The outcomes of this summit were then submitted to the community for comment. The public solicitation for input was detailed in two blog posts, one by PLOS (http://blogs.plos.org/tech/feedback-wanted-publishers-data-access/) and one by the California Digital Library (CDL) (http://datapub.cdlib.org/2014/03/24/feedback-wanted-publishers-and-data-access/), and promoted across social media outlets. An additional feedback session was held on March 28, 2014, at the Third Plenary of the Research Data Alliance (RDA), the largest gathering of the international data community. The report below presents the public endorsements, which have been amended, validated, and refined over the course of 2.5 months by the community at large. While this effort was intentionally designed to speak to publishers, we encourage companion efforts to establish community-based endorsements aimed at other critical stakeholders in the research ecosystem.

Call to Action

As a community, we envision a future information ecosystem in which research data is considered an integral part of scholarly communications. We propose a new metaphor for this vision: a social contract. This contract is an agreement amongst all stakeholders based on shared governing principles: data should be preserved, discoverable, measured, and integrated into evaluation processes, and data sharing is a fundamental practice. Adherence to this social contract will entail dramatic changes to existing workflows, technologies, and social norms for all the members of the research ecosystem.

While data stewardship requires expertise and knowledge that will be spread across other stakeholder groups (data centers, researchers, librarians, etc.), this document addresses the potential role of publishers in promoting the collective vision. Publishers play a critical role in this collective space, be they commercial, nonprofit, society, open access, institutional, etc. Because of the importance of formal publications in the academic incentive structure, publishers occupy a leverage point in the research process. We see an opportunity for them to become a strong force in effecting social and technical change. They can serve as the implementation and/or enforcement arm at the point of publication for the governing principles mentioned above. They have the potential to serve as honest brokers: listening to concerns from institutions and libraries about data curation and publication, and engaging with stakeholders to help establish and enforce agreed-upon standards that suit the community as a whole and ensure access to the data underlying the works they publish. Publishers can strive to be honest and transparent about their services and the costs of those services, especially if data archival costs are incurred. Above all, they can collaborate and coordinate their efforts with repositories and funders to cement the principles of data sharing and reuse as mutual stewards of this new ecosystem.

Recommendations

Collectively, we recommend a comprehensive approach that encompasses the entire research process. We present eight recommendations for publishers to promote the work of libraries and institutions in advancing data preservation and access (Box 2). Each is illustrated with concrete examples of projects that would support it.

Box 2. Recommendations for Publishers to Increase Access to Data

1. Establish and enforce a mandatory data availability policy.
2. Contribute to establishing community standards for data management and sharing.
3. Contribute to establishing community standards for data preservation in trusted repositories.
4. Provide formal channels to share data.
5. Work with repositories to streamline data submission.
6. Require appropriate citation to all data associated with a publication, both produced and used.
7. Develop and report indicators that will support data as a first-class scholarly output.
8. Incentivize data sharing by promoting the value of data sharing.

1. Establish and enforce a mandatory data availability policy

The incentive structure for scholars is currently based on publishing journal articles: in most disciplines, frequent publication, especially in high-impact journals, is perceived as a reliable indicator of a successful academic researcher [3]. Regardless of whether this incentive structure is ideal, it means that publishers are important gatekeepers in communicating science. In this role, publishers have a unique opportunity to effect change by requiring that data supporting the results of a publication be openly and freely available by default. We recognize that in some cases this is not possible due to privacy, sensitivity, or ownership issues, but these cases are exceptional and should be treated as such.

A policy must not only be in place; it must also be enforced. Many publishers “request” or “strongly recommend” that researchers make data openly available, but these policies are perceived as optional and rarely result in data availability. We therefore recommend that the policy be mandatory. Vines et al. [4] found that mandated archiving policies increased the odds of finding the associated data almost 1000-fold. This suggests that by establishing and enforcing a data policy, publishers can have a dramatic effect on data availability.