Deposit the data that supports your dissertation! Many disciplines are gravitating toward a culture of openness and, with funders increasingly requiring shared data, this trend is likely to continue. Early career academics can begin to embrace the principles of open research and transparency by sharing their dissertation and thesis data. Data deposited in D-Scholarship@Pitt can be cited by other users and pointed to as evidence of scholarly impact.
D-Scholarship@Pitt:
The University Library System wants to work with you to help you share your supporting data! We are available to meet for a consultation about preparing your data for deposit in D-Scholarship@Pitt.
This page provides guidance on issues related to file management and sharing, including selecting file formats, naming files, and depositing research data in D-Scholarship@Pitt.
Developing a file management approach that makes sense to you can save time and resources in the long run. Memory is fleeting. An inconsistent or vague file naming convention (as seen in the adjacent image) can make it challenging to identify a relevant or current file later in your research process. It can also make it difficult to produce the research files that support your dissertation once your project is complete.
The format of the electronic data files you work with during your research may be determined by the research equipment and computer hardware and software that you have access to. However, for long-term preservation and ease of sharing, best practices may dictate that the files be converted to a different format after your project has ended. Give some thought to this eventuality at the outset. Considerations include:
Stanford University Libraries - Data Management Services provides a useful overview of preferred file formats. From the Stanford resource:
Containers: TAR, GZIP, ZIP
Databases: XML, CSV
Geospatial: SHP, DBF, GeoTIFF, NetCDF
Moving images: MOV, MPEG, AVI, MXF
Sounds: WAVE, AIFF, MP3, MXF
Statistics: ASCII, DTA, POR, SAS, SAV
Still images: TIFF, JPEG 2000, PDF, PNG, GIF, BMP
Tabular data: CSV
Text: XML, PDF/A, HTML, ASCII, UTF-8
Web archive: WARC
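As a rough illustration of how you might check a project folder against the list above, the following Python sketch flags files whose extensions are not on a short list of preferred formats. The mapping from format names to file extensions is an assumption made for this example and is deliberately partial, and the function name `files_to_review` is hypothetical. An extension alone cannot confirm a format (for example, it cannot tell PDF/A from ordinary PDF), so treat the output as a list of files to look at, not a verdict.

```python
from pathlib import Path

# Partial, illustrative mapping of the preferred formats listed above to
# common file extensions. This set is an assumption for the sketch, not an
# official list; adjust it to suit your discipline and repository.
PREFERRED_EXTENSIONS = {
    ".tar", ".gz", ".zip",           # containers
    ".xml", ".csv",                  # databases and tabular data
    ".shp", ".dbf", ".tif", ".tiff", # geospatial and still images
    ".png", ".gif", ".bmp", ".pdf",  # still images and text
    ".mov", ".avi", ".mxf",          # moving images
    ".wav", ".aiff", ".mp3",         # sounds
    ".html", ".txt", ".warc",        # text and web archives
}


def files_to_review(folder: Path) -> list[Path]:
    """Return files under `folder` whose extension is not in the preferred set."""
    return sorted(
        path
        for path in folder.rglob("*")
        if path.is_file() and path.suffix.lower() not in PREFERRED_EXTENSIONS
    )


if __name__ == "__main__":
    # Scan the current directory and print anything worth a second look.
    for path in files_to_review(Path(".")):
        print(f"Consider converting or documenting: {path}")
```

Files that appear in the output are not necessarily a problem. Some proprietary formats are unavoidable during active research, and conversion can wait until the project has ended.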
Additional helpful guidelines for selecting file formats can be found at these websites:
For data to be interpretable and useful to others, researchers should document their research workflow, decisions that they make during their research process, and their manipulation of the data. The UK Data Archive outlines a set of best practices for data documentation, which is captured here:
Good data documentation includes information on:
At data-level, datasets should also be documented with:
Variable-level descriptions may be embedded within a dataset itself as metadata. Other documentation may be contained in user guides, reports, publications, working papers and laboratory books (see Managing and Sharing Data, UK Data Archive).
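When a file format cannot hold variable-level descriptions itself (plain CSV is one example), a simple option is to keep a separate codebook file alongside the data. The Python sketch below writes one as a CSV. The variable names, the choice of columns, and the function name `write_codebook` are all hypothetical and chosen only for illustration; use whatever fields your discipline and repository expect.

```python
import csv
from pathlib import Path

# Columns for the codebook. This selection is an assumption for the sketch.
CODEBOOK_FIELDS = ["variable", "label", "type", "units", "missing_code"]

# Hypothetical variables from a hypothetical survey dataset.
EXAMPLE_CODEBOOK = [
    {"variable": "resp_id", "label": "Respondent identifier",
     "type": "integer", "units": "", "missing_code": ""},
    {"variable": "age", "label": "Age at interview",
     "type": "integer", "units": "years", "missing_code": "-99"},
]


def write_codebook(rows: list[dict], path: Path) -> None:
    """Write one row of documentation per variable to a CSV file."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=CODEBOOK_FIELDS)
        writer.writeheader()
        writer.writerows(rows)


if __name__ == "__main__":
    # Save the codebook next to the dataset it describes.
    write_codebook(EXAMPLE_CODEBOOK, Path("survey_codebook.csv"))
```

Keeping the codebook in an open format such as CSV means it can be deposited and read alongside the data without special software.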
Before you begin your research, decide on a naming convention for your files. Document the naming convention you choose, and make sure that you and your collaborators follow it. It will save you time and will help others who may use your files in the future!
When developing your naming conventions, consider the following suggestions:
If your project is already underway, it's not too late to bring existing file names in line with a consistent naming convention. The following are tools and approaches for renaming a collection of files:
Windows: Bulk Rename Utility (free)
Mac: Recent macOS versions support bulk file renaming. See the support article on the Apple site: "Rename files, folders, and disks on Mac"
Mac: Renamer 6
Linux: KRename (free)
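If you are comfortable with a little scripting, a short program is another way to rename files in bulk on any operating system. The Python sketch below is a minimal example under stated assumptions: the cleanup rules (lowercase names, underscores in place of spaces and punctuation) are illustrative choices and not a convention recommended by this guide, and the function names are hypothetical. It defaults to a dry run that only prints what would change, so you can review the plan before renaming anything.

```python
import re
from pathlib import Path


def normalized_name(filename: str) -> str:
    """Return a tidier file name using illustrative rules (not a standard)."""
    path = Path(filename)
    stem = path.stem.strip().lower()
    # Collapse runs of anything other than letters, digits, and hyphens
    # into a single underscore, then trim underscores from the ends.
    stem = re.sub(r"[^a-z0-9-]+", "_", stem).strip("_")
    return f"{stem}{path.suffix.lower()}"


def plan_renames(folder: Path) -> list[tuple[Path, Path]]:
    """List (old, new) path pairs for files whose names would change."""
    plan = []
    for old in sorted(folder.iterdir()):
        # Leave subfolders and hidden files alone.
        if not old.is_file() or old.name.startswith("."):
            continue
        new = old.with_name(normalized_name(old.name))
        if new != old:
            plan.append((old, new))
    return plan


def apply_renames(plan: list[tuple[Path, Path]], dry_run: bool = True) -> None:
    """Print the plan, and rename files only when dry_run is False."""
    for old, new in plan:
        if new.exists():
            # Never overwrite an existing file; resolve these by hand.
            print(f"SKIP {old.name}: {new.name} already exists")
        elif dry_run:
            print(f"WOULD RENAME {old.name} -> {new.name}")
        else:
            old.rename(new)
            print(f"RENAMED {old.name} -> {new.name}")


if __name__ == "__main__":
    # Preview changes in the current folder without touching any files.
    apply_renames(plan_renames(Path(".")), dry_run=True)
```

Work on a copy of your files the first time you run a script like this, and record the rules you applied in your project documentation so that collaborators can follow the same convention.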