Research Data Management @ Pitt

This guide will assist researchers in planning for the various stages of managing their research data and in preparing data management plans required with funding proposals.

The benefits of data sharing

Data sharing is mandated by a growing number of funding agencies, foundations, and journals. Sharing data offers many benefits to individual researchers and the research community:

  • Promotes new discoveries
  • Increases research impact and citation rates
  • Supports validation and replication
  • Enhances collaboration
  • Increases returns from public investment
  • Reduces redundant research

How to share

Researchers can share datasets beyond their own research teams in a number of ways:

  • Depositing data in a disciplinary repository
  • Depositing data in D-Scholarship, the University of Pittsburgh's institutional repository
  • Publishing in a data journal
  • Submitting data with a journal article as a supplemental file or repository that the journal publisher requires/recommends
  • Using a personal or research group website, wiki, blog, or other web-based tool (note that these tools may be effective in the short term but should not be viewed as solutions for long-term sharing and preservation)

Early in a project, researchers should determine whether there are any institutional, funder, or legal restrictions that would prevent or place conditions on the sharing of their data. For example, in order to share some types of data, you may be required to develop a Data Use Agreement that is signed by the Office of Research. See the Data Use Agreement (DUA) Frequently Asked Questions page on the Office of Research's website for more information on this possible requirement.

1. Cornell University's Research Data Management Service Group's page on "Sharing Data" was a useful resource in developing this list of options for sharing.

Preparing your data for sharing

Data creators may have to format, describe, clean, and de-identify their data to ensure that other researchers will find the datasets useful and understandable and, if applicable, to protect the privacy of human subjects. The UK Data Archive offers rich guidance on "Preparing data for deposit" that is a relevant starting point for researchers who are sharing their data with other researchers and who are publishing their data through deposit in a data repository. Researchers should follow any instructions that journal publishers and repositories provide on readying their data for deposit.
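As a rough illustration of the de-identification step, the following Python sketch drops columns that directly identify participants and replaces a participant ID with a coded value. The column names (`name`, `email`, `participant_id`) and the salt are hypothetical, and the approach is an assumption for illustration rather than a procedure prescribed by the UK Data Archive or by Pitt. Note that coding IDs with a hash is pseudonymization only; indirect identifiers (dates, locations, rare attributes) usually need separate review before data can be considered de-identified.

```python
import csv
import hashlib
import io

# Hypothetical column names; adjust these to match your own dataset.
DIRECT_IDENTIFIERS = {"name", "email"}
ID_COLUMN = "participant_id"


def pseudonymize(value, salt):
    """Return a short, stable code derived from a salted SHA-256 hash."""
    digest = hashlib.sha256((salt + value).encode("utf-8")).hexdigest()
    return digest[:12]


def deidentify_csv(text, salt):
    """Drop direct identifiers and code the ID column of CSV text."""
    reader = csv.DictReader(io.StringIO(text))
    # Keep every column except the direct identifiers.
    kept_fields = [f for f in reader.fieldnames if f not in DIRECT_IDENTIFIERS]

    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=kept_fields)
    writer.writeheader()
    for row in reader:
        cleaned = {f: row[f] for f in kept_fields}
        # The same input ID always maps to the same code, so records
        # from one participant can still be linked after sharing.
        cleaned[ID_COLUMN] = pseudonymize(row[ID_COLUMN], salt)
        writer.writerow(cleaned)
    return out.getvalue()
```

Keep the salt private and out of the shared files; anyone who knows it and can guess the original IDs could recompute the codes.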

Depositing your data in a disciplinary repository

Research data repositories can host, provide persistent access to, and preserve datasets. For many disciplines, there are repositories familiar to and well-used by researchers in the field (for example, in social science disciplines, ICPSR is a notable data archive). In addition to considering disciplinary practices around data deposit, researchers should determine whether their funder or publisher requires or recommends a specific data repository for archiving and making data available.

For researchers working to locate a repository for storing, accessing, and sharing data, re3data.org, a searchable registry of data repositories, is a very useful starting point. Researchers can browse re3data.org by discipline, data type, and country to discover an appropriate home for their data or to find shared datasets to use in their research. Interested in depositing your data in a fully open repository or one that gives your data a unique and permanent identifier that can be used in citations? re3data.org highlights key characteristics of data repositories with icons that help users to discover repositories that meet requirements of interest.

A number of academic libraries have developed overviews of repositories grouped by broad academic fields. A LibGuide by the University of California Irvine (UCI) Libraries is an exemplary resource in this respect.

 

Depositing your data in D-Scholarship

The University of Pittsburgh’s institutional repository, D-Scholarship, offers long-term storage for scholarly output. Pitt researchers can upload their published or unpublished work, including datasets, to D-Scholarship.

D-Scholarship@Pitt:

  • Accepts nearly any file format, including tar.gz and zip files

  • Assigns your data deposits a Digital Object Identifier (DOI), a permanent and unique identifier for a digital object that helps others to find and cite your data

  • Allows you to add information that provides important context for your data so that others can discover, understand, and trust the data files

  • Is best suited for datasets that are in an inactive state (i.e., after the completion of a research project)

  • Tracks your work using alternative metrics to help demonstrate your impact and see how others are using your data

  • Can be used to add a catalog-only entry for datasets that you’ve deposited in another data repository

  • Allows you to make data fully public, private, or available only to the Pitt community

Online help and an FAQ page are available at the D-Scholarship@Pitt website.
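Because the repository accepts tar.gz files, one way to prepare a multi-file dataset for deposit is to bundle the project folder into a single archive together with a checksum manifest. The Python sketch below is a minimal illustration under assumptions: the function names and the `MANIFEST.sha256` file name are hypothetical, and nothing here is a D-Scholarship requirement.

```python
import hashlib
import tarfile
from pathlib import Path

MANIFEST_NAME = "MANIFEST.sha256"  # hypothetical file name


def sha256_of(path):
    """Compute the SHA-256 checksum of a file, reading it in chunks."""
    h = hashlib.sha256()
    with Path(path).open("rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def package_dataset(data_dir, archive_path):
    """Write a checksum manifest into data_dir, then bundle it as tar.gz.

    archive_path should be outside data_dir so the archive does not
    try to include itself.
    """
    data_dir = Path(data_dir)
    archive_path = Path(archive_path)

    # List every file except a manifest left over from an earlier run.
    files = sorted(
        p for p in data_dir.rglob("*")
        if p.is_file() and p.name != MANIFEST_NAME
    )
    lines = [
        f"{sha256_of(p)}  {p.relative_to(data_dir).as_posix()}"
        for p in files
    ]
    (data_dir / MANIFEST_NAME).write_text("\n".join(lines) + "\n", encoding="utf-8")

    # Store files under the folder's own name inside the archive.
    with tarfile.open(str(archive_path), "w:gz") as tar:
        tar.add(str(data_dir), arcname=data_dir.name)
    return archive_path
```

For example, `package_dataset("my_study", "my_study.tar.gz")` would produce one file to upload, and the manifest lets anyone who downloads it confirm that the contents arrived intact.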

 

If you are writing a data management plan and planning to deposit your data in D-Scholarship@Pitt, the following language can be adapted:

Research data from this project will be deposited in D-Scholarship@Pitt, the University of Pittsburgh's institutional repository that is hosted and maintained by the University Library System. D-Scholarship provides stable, long-term storage and ongoing maintenance for datasets and other scholarly products. D-Scholarship will increase the discoverability of the research data as the repository allows indexing by Google and other major Internet search engines, the Pennsylvania Digital Library, and PITTCat+. The data will be described in D-Scholarship using a metadata schema that is based on DataCite and DDI. The data deposited will be assigned a DOI.

 

Publishing your data in a journal

In the sciences in particular, there is a growing number of data journals, which publish data papers as a means to promote data availability and reuse. In May 2014, Katherine Akers (then of the University of Michigan Library and now at Wayne State University's Shiffman Medical Library) developed a non-exhaustive list of data journals, which can be found at the Data@MLibrary blog.

Authors may be able to deposit data as a supplemental file or files to be accessible alongside the published article online. In addition, journal publishers are increasingly developing policies that require data reported and used in published studies to be deposited in a repository. For an example of such a policy, see Science's "General Information for Authors."