Data sharing is mandated by a growing number of funding agencies, foundations, and journals. There are many benefits of data sharing to individual researchers and the research community:
Researchers have a number of ways to share datasets beyond their own research teams.[1]
Early in a project, researchers should determine whether there are any institutional, funder, or legal restrictions that would prevent or place conditions on the sharing of their data. For example, in order to share some types of data, you may be required to develop a Data Use Agreement that is signed by the Office of Research. See the Data Use Agreement (DUA) Frequently Asked Questions page on the Office of Research's website for more information on this possible requirement.
1. Cornell University's Research Data Management Service Group's page on "Sharing Data" was a useful resource in developing this list of options for sharing.
Data creators may have to format, describe, clean, and de-identify their data so that other researchers will find the datasets useful and understandable and, where applicable, to protect the privacy of human subjects. The UK Data Archive offers rich guidance on "Preparing data for deposit" that is a relevant starting point for researchers who are sharing their data with other researchers or publishing their data through deposit in a data repository. Researchers should follow any instructions that journal publishers and repositories provide on readying their data for deposit.
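As one illustration of the de-identification step mentioned above, the following is a minimal Python sketch, not a complete or approved procedure. The column names (`name`, `email`, `phone`, `participant_id`) and the secret key are hypothetical, and the approach (dropping direct identifiers and replacing the ID with a keyed hash) is only one common technique. It does not address indirect identifiers such as rare combinations of age and location, so any real dataset should still be reviewed against the applicable institutional and repository guidance.

```python
import hashlib
import hmac

# Hypothetical column names; adjust these to match the actual dataset.
DIRECT_IDENTIFIERS = {"name", "email", "phone"}
ID_COLUMN = "participant_id"


def pseudonymize(value, secret_key):
    """Return a short keyed hash so records stay linkable without exposing the ID."""
    digest = hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:12]


def deidentify(rows, secret_key):
    """Drop direct identifiers and replace the ID column with a pseudonym.

    `rows` is a list of dicts (for example, rows read with csv.DictReader).
    The input rows are not modified; new dicts are returned.
    """
    cleaned = []
    for row in rows:
        kept = {k: v for k, v in row.items() if k not in DIRECT_IDENTIFIERS}
        if ID_COLUMN in kept:
            kept[ID_COLUMN] = pseudonymize(str(kept[ID_COLUMN]), secret_key)
        cleaned.append(kept)
    return cleaned


if __name__ == "__main__":
    # Made-up sample rows, for illustration only.
    sample = [
        {"participant_id": "P001", "name": "Jane Doe", "email": "jane@example.org", "age": 34},
        {"participant_id": "P002", "name": "Richard Roe", "email": "rr@example.org", "age": 41},
    ]
    for row in deidentify(sample, secret_key=b"keep-this-key-private"):
        print(row)
```

The key should be stored separately from the shared files; anyone holding it could re-link pseudonyms to the original IDs.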
Research data repositories can host, provide persistent access to, and preserve datasets. For many disciplines, there are repositories familiar to and well-used by researchers in the field (for example, in social science disciplines, ICPSR is a notable data archive). In addition to considering disciplinary practices around data deposit, researchers should determine whether their funder or publisher requires or recommends a specific data repository for archiving and making data available.
For researchers working to locate a repository for storing, accessing, and sharing data, re3data.org, a searchable registry of data repositories, is a very useful starting point. Researchers can browse re3data.org by discipline, data type, and country to discover an appropriate home for their data or to find shared datasets to use in their research. Interested in depositing your data in a fully open repository or one that gives your data a unique and permanent identifier that can be used in citations? re3data.org highlights key characteristics of data repositories with icons that help users to discover repositories that meet requirements of interest.
A number of academic libraries have developed overviews of repositories grouped by broad academic fields. A LibGuide by the University of California Irvine (UCI) Libraries is an exemplary resource in this respect. See their groupings:
The University of Pittsburgh’s institutional repository D-Scholarship offers long-term storage for scholarly output. Pitt researchers can upload their published or unpublished work to D-Scholarship, including datasets.
Accepts nearly any format of file including tar.gz and zip files
Assigns your data deposits a Digital Object Identifier (DOI), a permanent and unique identifier for a digital object; when used in citations, the DOI helps others to find and cite your data
Allows you to add information that provides important context for your data so that others can discover, understand, and trust the data files
Is best suited for datasets that are in an inactive state (i.e., after the completion of a research project)
Tracks your work using alternative metrics to help demonstrate your impact and see how others are using your data
Can be used to add a catalog-only entry for datasets that you’ve deposited in another data repository
Allows you to make data fully public, private, or available only to the Pitt community
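To illustrate how a DOI is used in a data citation, here is a small Python sketch. It loosely follows the general "Creator (Year). Title. Publisher. Identifier" pattern that DataCite recommends; it is not D-Scholarship's own citation format. The `DatasetRecord` class, the function names, and the DOI `10.1234/example` are hypothetical placeholders.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DatasetRecord:
    creators: List[str]
    year: int
    title: str
    publisher: str
    doi: str  # e.g. "10.1234/example" (a placeholder, not a real DOI)


def doi_url(doi):
    """Normalize a DOI given as a bare string, 'doi:...', or a doi.org URL."""
    doi = doi.strip()
    for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
        if doi.lower().startswith(prefix):
            doi = doi[len(prefix):]
            break
    return f"https://doi.org/{doi}"


def format_citation(record):
    """Build a citation string: Creator (Year). Title. Publisher. DOI URL."""
    creators = "; ".join(record.creators)
    return (
        f"{creators} ({record.year}). {record.title}. "
        f"{record.publisher}. {doi_url(record.doi)}"
    )


if __name__ == "__main__":
    record = DatasetRecord(
        creators=["Doe, Jane", "Roe, Richard"],
        year=2020,
        title="Example Survey Data",
        publisher="University of Pittsburgh",
        doi="doi:10.1234/example",
    )
    print(format_citation(record))
```

Journals and style guides vary in how they want datasets cited, so the output should be adjusted to the target venue's requirements.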
If you are writing a data management plan and planning to deposit your data in D-Scholarship@Pitt, the following language can be adapted:
Research data from this project will be deposited in D-Scholarship@Pitt, the University of Pittsburgh's institutional repository that is hosted and maintained by the University Library System. D-Scholarship provides stable, long-term storage and ongoing maintenance for datasets and other scholarly products. D-Scholarship will increase the discoverability of the research data as the repository allows indexing by Google and other major Internet search engines, the Pennsylvania Digital Library, and PITTCat+. The data will be described in D-Scholarship using a metadata schema that is based on DataCite and DDI. The data deposited will be assigned a DOI.
In the sciences in particular, there are a growing number of data journals, which publish data papers as a means to promote data availability and reuse. In May 2014, Katherine Akers (then of the University of Michigan Library and now at Wayne State University's Shiffman Medical Library) developed a non-exhaustive list of data journals, which can be found at the Data@MLibrary blog.
Authors may be able to deposit data as a supplemental file or files to be accessible alongside the published article online. In addition, journal publishers are increasingly developing policies that require data reported and used in published studies to be deposited in a repository. For an example of such a policy, see Science's "General Information for Authors."