Pitt community: write to Digital Scholarship Services or use our AskUs form
Pitt health sciences researchers: contact Data Services, Health Sciences Library System
Dominic Bordelon, dbordelon@pitt.edu
"Data Sharing @ Pitt" by University of Pittsburgh Library System is licensed for reuse under a Creative Commons Attribution 4.0 International (CC BY 4.0) license.
What is "Data Sharing"?
In academic research contexts, data sharing refers to the posting of raw and/or processed data in a repository for use by other researchers. Typically, access to such repositories is free and open to the general public via the internet, although access may need to be controlled if the data are sensitive. Today, when a researcher submits a manuscript, the publisher will often require that the data supporting the findings be shared in an open repository; however, researchers may also choose to share data outside of the traditional publication scope. Some disciplines, such as genomics, rely heavily on readily available big data, and data centers such as the National Radio Astronomy Observatory (NRAO) Archive support research by persistently offering data produced by facility-scale instruments.
Here are some of the reasons for data sharing, according to agencies and proponents:
Data sharing is also part of the Open Science movement and an extension of the Open Access (OA) movement. Data that have been shared are sometimes called "Open Data," but note that this term may also refer to civic and governmental data.
Before posting your data, there are several elements to consider:
The University of Pittsburgh does not require researchers to share their data. However, the University does have a data retention policy of seven years (Nordenberg 2009). Additional data guidance and other resources can be found on the University's Human Research Protection Office website.
The only major funder currently requiring data sharing in proposals is the National Institutes of Health (NIH), effective since January 2023. Find out more about NIH data sharing requirements from Pitt's Health Sciences Library System (HSLS). That said, other agencies are expected to join the NIH in this requirement soon.
The National Science Foundation (NSF) has announced NSF Public Access Plan 2.0, which requires (effective for proposals submitted January 2025 onward) Open-Access publication of manuscripts and sharing of supporting data in repositories. Budget allowances are made for costs associated with data management and sharing. More specific plans and requirements may be announced by individual NSF directorates.
Both NIH and NSF are responding to the 2022 OSTP Memo by Director Alondra Nelson, which mandates that federal granting agencies require data sharing by December 31, 2025. Additionally, there is to be no embargo on shared data; some sharing schemes so far have embargoed (delayed) data sharing until a year after manuscript publication. The memo names better science as one goal and equitable access to research outcomes as another. It cites the research community's response to COVID-19 as a particular success story of data sharing.
Increasingly, publishers are requiring sharing of supporting data. A list can be found at the Publisher Data Availability Policies Index.
Thinking about sharing your own data? Check out the pages below (or in the navigation bar) to learn more.