Data Quality Audits

Accurate data are essential to the integrity of research. For an integrated data system (IDS) to produce dependable results, the governing board overseeing it must establish appropriate methods for assessing the reliability and validity of data elements, so as to maximize the utility of the information they contain. Several established procedures that researchers can use to evaluate data reliability are listed below:

  • Variable-level auditing – looks for out-of-range codes or codes that may have changed over time
  • Reliability measures – variables are scored for reliability so that external requestors know how reliable a given variable is
  • Establishing common audit routines – helps measure the completeness of a given variable (degree of missing data), its accuracy (the proportion of valid codes), and its coverage (gaps in the time periods reported or in the providers reporting).
  • Reliability and validity testing – an important data-auditing task for evaluating the scientific capacity of data to be included in the IDS. Such testing checks that data collected on a variable actually represent the phenomenon in question. Because it can be time-consuming, IDS leadership should partner with data-sharing agencies to periodically seek funding for these audits. When two data sources are available for a given measure (e.g., the diagnosis associated with a hospitalization), the redundant sources can be compared to assess the degree of agreement between them. The comparison can use redundant measures of the same variable from different databases, or a sample of charts or other documentation checked against the electronic records. Although chart review is time-consuming, charts can be compared with administrative records on a periodic basis. Administrators may also want to include data accuracy and completeness as a performance measure in service contracts with providers and other agencies as a way of improving data quality.
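
The audit measures described above can be illustrated with a short sketch. The Python below is a minimal example under stated assumptions, not a prescribed IDS procedure: the function names, the sample codes, and the choice to treat None and empty strings as missing are all hypothetical. The text does not name a specific agreement statistic, so Cohen's kappa is used here only as one common chance-corrected option alongside simple percent agreement.

```python
from collections import Counter

MISSING = (None, "")  # assumption: how missing values are encoded


def completeness(values):
    """Share of records with a non-missing value."""
    if not values:
        return 0.0
    return sum(v not in MISSING for v in values) / len(values)


def accuracy(values, valid_codes):
    """Proportion of non-missing values that are valid codes."""
    present = [v for v in values if v not in MISSING]
    if not present:
        return 0.0
    return sum(v in valid_codes for v in present) / len(present)


def coverage_gaps(reported_periods, expected_periods):
    """Expected reporting periods for which no records were received."""
    seen = set(reported_periods)
    return [p for p in expected_periods if p not in seen]


def percent_agreement(source_a, source_b):
    """Share of paired records on which two sources record the same value."""
    if len(source_a) != len(source_b) or not source_a:
        raise ValueError("sources must be non-empty and of equal length")
    return sum(a == b for a, b in zip(source_a, source_b)) / len(source_a)


def cohen_kappa(source_a, source_b):
    """Agreement between two sources, corrected for chance agreement."""
    n = len(source_a)
    p_observed = percent_agreement(source_a, source_b)
    counts_a, counts_b = Counter(source_a), Counter(source_b)
    # Chance agreement: probability both sources pick the same code at random.
    p_expected = sum(counts_a[k] * counts_b[k] for k in counts_a) / (n * n)
    if p_expected == 1:
        return 1.0  # both sources use a single identical code throughout
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical variable with one out-of-range code ("X") and two missing values.
sex_codes = ["M", "F", "F", None, "X", "M", "", "F"]
print(completeness(sex_codes))                      # 0.75
print(round(accuracy(sex_codes, {"M", "F"}), 3))    # 0.833

# Hypothetical monthly reporting periods with one gap.
expected = ["2020-01", "2020-02", "2020-03", "2020-04"]
print(coverage_gaps(["2020-01", "2020-02", "2020-04"], expected))  # ['2020-03']

# Hypothetical diagnoses for the same eight hospitalizations in two sources.
admin = ["asthma", "asthma", "diabetes", "diabetes",
         "asthma", "diabetes", "asthma", "diabetes"]
chart = ["asthma", "asthma", "diabetes", "asthma",
         "asthma", "diabetes", "diabetes", "diabetes"]
print(percent_agreement(admin, chart))  # 0.75
print(cohen_kappa(admin, chart))        # 0.5
```

Run on a schedule, routines like these would give each variable a comparable set of scores that could be reported to external requestors. The thresholds for acceptable completeness, accuracy, or agreement are left to the governing board and are not assumed here.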