Data Sharing @ Pitt

Learn about the principles and how-to of sharing academic research data.

Usage of disciplinary information standards

Information standards, and metadata standards in particular, help others make sense of and reuse your data. As a simple example, consider two ways of writing the date January 10, 2023: US-style 1/10/23 (month first) or European-style 10/1/23 (day first). Without context, a reader who encounters either string cannot tell which reading was intended ("did the data creator mean January 10th or October 1st?"). ISO 8601, an international standard for formatting dates and times, removes this ambiguity by writing January 10, 2023 as 2023-01-10 (year, then month, then day). If your data contain any date or time information, an important preparation step is to make sure that all of it follows ISO 8601 formatting.
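As a rough illustration of that preparation step, the Python sketch below converts slash-separated dates to ISO 8601 using only the standard library. The function name `to_iso8601` and its `day_first` parameter are made up for this example. The key assumption is that you already know which convention your source data used, because the string alone cannot tell you.

```python
from datetime import datetime


def to_iso8601(date_string: str, day_first: bool) -> str:
    """Convert a D/M/YY or M/D/YY string to an ISO 8601 date (YYYY-MM-DD).

    The caller must state the convention explicitly, because a string
    such as "1/10/23" is ambiguous on its own.
    """
    fmt = "%d/%m/%y" if day_first else "%m/%d/%y"
    return datetime.strptime(date_string, fmt).date().isoformat()


# The two spellings of January 10, 2023 from the paragraph above.
print(to_iso8601("1/10/23", day_first=False))  # US-style input -> 2023-01-10
print(to_iso8601("10/1/23", day_first=True))   # European-style input -> 2023-01-10

# The same characters read under the other convention give a different date.
print(to_iso8601("1/10/23", day_first=True))   # -> 2023-10-01
```

The last call shows why the convention has to be recorded somewhere: the same input yields a different date depending on how it is read, and only the ISO 8601 output is unambiguous.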

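Many information and metadata standards are specific to a discipline. Standards can also apply at different levels, from the macro level (for example, an entire file format) to the micro level (for example, a single identifier or variable). Here are a few disciplinary examples: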

  • Astronomers use the FITS (Flexible Image Transport System) file format, in which the image data is preceded by a plain-text header carrying important metadata about the image's provenance. ("Provenance" refers to how the image was produced: the instrumentation, when and where the instrument was positioned, etc.)
  • Gene annotations in genomic databases use a "locus tag" identifier so that different contributors can refer to the same gene systematically.
  • Common Data Elements (CDEs) are widely used in clinical research to standardize patient variables so that they can be compared across studies.
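To make the FITS example more concrete, here is a simplified Python sketch of how a plain-text header can carry provenance metadata. It relies on the fact that a FITS header is a sequence of fixed-width, 80-character records ("cards") that ends with an END card. The header values below are invented for illustration, and the helper names `card` and `parse_header` are hypothetical. The parser ignores details of the full standard, such as escaped quotes inside strings, so real work should use an established FITS library such as `astropy.io.fits` instead.

```python
CARD_LEN = 80  # each FITS header record ("card") is 80 ASCII characters


def card(text: str) -> str:
    """Pad a line to the fixed 80-character card width."""
    return text.ljust(CARD_LEN)


# Hypothetical header values, invented for illustration only.
SAMPLE_HEADER = "".join([
    card("SIMPLE  =                    T / conforms to FITS standard"),
    card("TELESCOP= 'Example Scope'      / hypothetical instrument name"),
    card("DATE-OBS= '2023-01-10'         / observation date, ISO 8601"),
    card("EXPTIME =                 30.0 / exposure time in seconds"),
    card("END"),
])


def parse_header(header: str) -> dict[str, str]:
    """Return keyword -> value (as text) for each 'KEYWORD = value' card.

    Simplified sketch: does not handle doubled quotes inside strings
    or values continued across several cards.
    """
    values = {}
    for start in range(0, len(header), CARD_LEN):
        record = header[start:start + CARD_LEN]
        keyword = record[:8].strip()
        if keyword == "END":
            break  # the END card closes the header
        if record[8:10] != "= ":
            continue  # commentary cards (COMMENT, HISTORY) carry no value
        raw = record[10:].strip()
        if raw.startswith("'"):
            # Quoted string value: take the text up to the closing quote.
            closing = raw.find("'", 1)
            if closing == -1:
                closing = len(raw)
            values[keyword] = raw[1:closing].rstrip()
        else:
            # Unquoted value: anything after a slash is a comment.
            values[keyword] = raw.split("/", 1)[0].strip()
    return values


metadata = parse_header(SAMPLE_HEADER)
print(metadata["TELESCOP"], metadata["DATE-OBS"])
```

Because the header is plain text with a fixed layout, anyone who later receives the file can recover when and with what instrument the image was taken, without needing the original software.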

Which standards are relevant to you will depend on your field.

💡 Note: While some standards may be applied after the fact, others will require consideration during your study design and data collection phases. Think about standards early to maximize reproducibility and interoperability.