Pitt community: write to Digital Scholarship Services or use our AskUs form
Pitt health sciences researchers: contact Data Services, Health Sciences Library System
Dominic Bordelon (dbordelon@pitt.edu) and Rachel Starry (ras545@pitt.edu)
"Data Management @ Pitt" by University of Pittsburgh Library System is licensed for reuse under a Creative Commons Attribution 4.0 International (CC BY 4.0) license.
Data quality is, simply put, a matter of having "data that are fit for use by data consumers" (Wang and Strong 1996); note that "perfection" is not a goal. Many frameworks for data quality have been proposed, such as Total Data Quality Management (TDQM) and Data Quality Assessment (DQA), as well as frameworks for specific domains such as health care or Web data. Different frameworks emphasize different dimensions (characteristics or attributes) of the data. These dimensions are "highly context dependent and their relevancy and importance can vary between organizations and types of data" (Cichy and Rass 2019). Surveying many frameworks, Cichy and Rass identify the following dimensions as the most common in defining data quality:
These dimensions give us areas of focus when assessing and attempting to improve data quality. Consequences of poor data quality may include inaccuracy, greater uncertainty, and even misguided decision making.
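As a rough illustration of how a dimension can become a concrete check, the sketch below measures two simple properties of a small table: how complete each field is and how many rows are exact duplicates. The records, the field names, and the two function names (`completeness`, `duplicate_count`) are hypothetical, and these two checks are assumptions chosen for the example rather than the dimensions or methods of any particular framework.

```python
# Hypothetical records for illustration; None marks a missing value.
records = [
    {"id": 1, "site": "A", "temp_c": 21.5},
    {"id": 2, "site": "B", "temp_c": None},
    {"id": 2, "site": "B", "temp_c": None},  # exact repeat of the previous row
    {"id": 3, "site": None, "temp_c": 19.0},
]


def completeness(rows, field):
    """Return the share of rows in which `field` has a non-missing value."""
    if not rows:
        return 0.0
    present = sum(1 for row in rows if row.get(field) is not None)
    return present / len(rows)


def duplicate_count(rows):
    """Return the number of rows that exactly repeat an earlier row."""
    seen = set()
    duplicates = 0
    for row in rows:
        # Sort by field name only, so values (which may be None) are never compared.
        key = tuple(sorted(row.items(), key=lambda item: item[0]))
        if key in seen:
            duplicates += 1
        else:
            seen.add(key)
    return duplicates


for field in ("id", "site", "temp_c"):
    print(f"{field}: {completeness(records, field):.0%} complete")
print(f"exact duplicate rows: {duplicate_count(records)}")
```

Whether a given score is acceptable depends on the intended use of the data: a field that is 50% complete may be fine for one analysis and unusable for another, in keeping with the "fit for use" definition above.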