Data Sharing @ Pitt

Learn about the principles and how-to of sharing academic research data.

Choosing a repository for sharing your data

Data sharing entails choosing a permanent steward for your data files, who will host them on the internet. Repositories for research data offer several benefits:

  • Lower risk of loss to sunsetting (after a researcher leaves the institution, for example)
  • Greater discoverability, because your data sit alongside other data sets and benefit from scholarly infrastructure such as digital object identifiers (DOIs)
  • Better machine-readability for research software than a typical website offers, which further aids discovery and reuse
  • Compliance with funder and publisher recommendations/requirements
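
To illustrate why DOIs help with discovery and machine-readability: a DOI can be turned into a stable web link by appending it to the https://doi.org/ resolver. The sketch below shows one way software might do that. The function name `doi_to_url`, the loose validation pattern, and the sample DOI are assumptions made for illustration; they do not come from any particular repository's tooling.

```python
import re

# Loose pattern for modern DOIs: "10.", a 4-9 digit registrant code, "/", a suffix.
# Illustrative only; this is not a complete DOI validator.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")


def doi_to_url(doi: str) -> str:
    """Return the https://doi.org/ resolver URL for a DOI string."""
    cleaned = doi.strip()
    # Accept forms people commonly paste: "doi:..." or a full resolver URL.
    for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
        if cleaned.lower().startswith(prefix):
            cleaned = cleaned[len(prefix):].strip()
            break
    if not DOI_PATTERN.match(cleaned):
        raise ValueError(f"Not a recognizable DOI: {doi!r}")
    return f"https://doi.org/{cleaned}"


if __name__ == "__main__":
    # Hypothetical DOI, used only as a placeholder.
    print(doi_to_url("doi:10.5281/zenodo.0000000"))
```

Because the link is built from the identifier and not from a repository's current web address, citations that use it can keep working even if the hosting site reorganizes its pages.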

So, which repository? The options fit into these categories:

  • a disciplinary repository and/or a repository purpose-built for your type of data. Find these by looking at publications in your field and/or searching on re3data.
  • a generalist repository, which accepts data from researchers in any field, working at any institution. Popular options that are free (up to a certain file size) include Zenodo, Open Science Framework (osf.io), and Figshare.
  • D-Scholarship, Pitt's institutional repository, where any Pitt affiliate may deposit their research outputs

Factors to consider include cost (if any), storage space (if your data exceed ~1 GB), goodness-of-fit between the repository and your work, and the services and features the repository offers. To compare the generalist repositories against one another, check the Generalist Repository Comparison Chart (Stall et al. 2023).

For a more detailed selection process, see the Data Repository Selection Decision Tree (Enabling FAIR Data Community 2018) or contact us.

💡 For data sharing purposes, GitHub does not guarantee long-term access, so it is not sufficiently preservation-oriented to be considered a research data repository. It is a great place to work and collaborate on your project, but you should also deposit your data in one of the options described above.

Choosing a license for sharing data

A license tells prospective users the conditions under which they may use the data, including how you would like the data to be cited.

Available options include:

Your choice of license may also be constrained by the repository where you deposit the data. Dryad, for example, requires a CC0 rights waiver on all submissions.