Pitt community: write to Digital Scholarship Services or use our AskUs form
Pitt health sciences researchers: contact Data Services, Health Sciences Library System
Dominic Bordelon (dbordelon@pitt.edu) and Rachel Starry (ras545@pitt.edu)
"Data Management @ Pitt" by University of Pittsburgh Library System is licensed for reuse under a Creative Commons Attribution 4.0 International (CC BY 4.0) license.
In an academic research context, data management is primarily about how individual researchers and small teams manage their data, and their computing activities generally, in order to produce impactful findings. For findings to have maximum impact, the underlying data should be well organized, well described, and available to other researchers for validation and replication. Some also argue that, without effective data management practices, findings would not be made in the first place and a research program could not be sustained. Therefore, research data management tends to encompass the following areas of activity:
Data managers and/or data curators working in a research center or data center, either as part of a larger research team or as institutional staff, will additionally be thinking about providing infrastructure, search and retrieval, and the long-term stewardship of submitted files (e.g., defending against degradation or "bit rot").
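One common way to detect the degradation mentioned above is a fixity check: record a checksum for each file at deposit time, then recompute and compare later. The following is a minimal sketch, not a description of any Pitt system. The function names (sha256_of, build_manifest, changed_files) and the choice of SHA-256 are assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file path (relative to root) to its checksum."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def changed_files(root: Path, manifest: dict[str, str]) -> list[str]:
    """List manifest entries that are missing or no longer match."""
    problems = []
    for name, expected in manifest.items():
        target = root / name
        if not target.is_file() or sha256_of(target) != expected:
            problems.append(name)
    return problems
```

In practice the manifest would be stored separately from the data (for example, as a text file kept with the deposit record) so that a later run of changed_files can flag files that were silently altered or lost.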
In an industry context, data management has many of the same concerns but at an enterprise scale. The team of "collaborators" includes a great many producers and consumers of constantly evolving, under-documented data, so specialized roles have emerged to manage the data.
The term data management will also have a different meaning for IT professionals depending on their location in the technology stack; for example, "data management" will have different, specific meanings for a database administrator, a firmware engineer, or a network administrator.
Research data at Pitt are currently governed by Interim Policy RI 14 (October 16, 2023). The policy defines several terms, including Data Sharing, Research Outputs, and the Research Record. The policy establishes terms of data ownership, terms for storage and access, and rules associated with transfer and retention of data. It is recommended that all PIs become familiar with this policy. The full policy is available at the Office of Policy Development and Management.
In research data management, a commonly discussed framework or model is the "Research Data Lifecycle." The lifecycle asks us to think about data, as a primary research output, from planning and inception, through usage, and finally to dissemination and preservation.
This model has several benefits: it primes us to plan ahead for information and resources that will be needed later in the lifecycle; it reminds us that the data do not simply "die" after their collector publishes; and it makes preservation harder to neglect, which can happen when we are overly focused on the processing and analysis steps, for example.
The Office of Institutional Research collects and analyzes information about the University that supports the data-driven decisions of the University’s executive administration, the University community, and external agencies. OIR produces Pitt's Interactive Fact Book and Common Data Set and is responsible for reporting demographic, financial, and faculty activity information.
Academic analytics are the focus of the Office of the Provost's Data Analytics Team. Through business intelligence reports and predictive analytics models, the team supplies senior administrators with the information they need to make data-driven decisions. The team produces a variety of weekly, term-based, and annual reports and dashboards using data characterizing students, faculty, and staff; conducts ad hoc analytics projects to address specific questions; and oversees the administration and analysis of a number of University-wide surveys of students and faculty.
Access to University data is controlled by Pitt IT and may be requested via the appropriate data request form(s).