
Table 1 The details of Kahn’s DQA framework

From: Application of openEHR archetypes to automate data quality rules for electronic health records: a case study

| Definition of assessment dimensions | Sub-dimension | Definition |
| --- | --- | --- |
| Conformance: whether data values fulfill certain standards and formats | Value conformance: data values conform to prespecified data types, data domains, allowable values, value sets, or terminology standards | Data values conform to internal formatting constraints |
| | | Data values conform to allowable values or ranges |
| | Relational conformance: data values conform to relational constraints imposed by the physical database structure | Data values conform to relational constraints |
| | | Unique (key) data values are not duplicated |
| | | Changes to the data model or data model versioning |
| | Computational conformance: calculated values are consistent with the technical functional specification | Computed values conform to computational or programming specifications |
| Completeness: features that describe the frequencies of data attributes present in a data set without reference to data values | | The absence of data values at a single moment in time agrees with local or common expectations |
| | | The absence of data values measured over time agrees with local or common expectations |
| Plausibility: features that describe the believability or truthfulness of data values | Uniqueness plausibility: objects that appear multiple times are not duplicates or cannot be distinguished | Data values that identify a single object are not duplicated |
| | Atemporal plausibility: observed data values, distributions, or densities agree with local or “common” knowledge (Verification) or with comparisons against external sources that are deemed to be trusted or relative gold standards (Validation) | Data values and distributions agree with an internal measurement or local knowledge |
| | | Data values and distributions for independent measurements of the same fact are in agreement |
| | | Logical constraints between values agree with local or common knowledge (includes “expected” missingness) |
| | | Values of repeated measurements of the same fact show expected variability |
| | Temporal plausibility: time-varying variables change values as expected based on known temporal properties or across one or more external comparators or gold standards | Observed or derived values conform to expected temporal properties |
| | | Sequences of values that represent state transitions conform to expected properties |
| | | Measures of data value density against a time-oriented denominator are as expected based on internal knowledge |
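
As a rough illustration of how the dimensions in Table 1 might translate into executable rules, the following Python sketch applies one check each for value conformance, relational conformance, completeness, and temporal plausibility to a few made-up records. The field names (`patient_id`, `sex`, `birth_date`, `admit_date`), the allowed value set, and the function names are all assumptions made for this example. They do not come from the paper, from Kahn's framework, or from any openEHR archetype.

```python
from datetime import date

# Hypothetical records; field names and values are illustrative only.
RECORDS = [
    {"patient_id": "p1", "sex": "F",
     "birth_date": date(1980, 5, 1), "admit_date": date(2020, 3, 2)},
    {"patient_id": "p2", "sex": "X",
     "birth_date": date(1975, 1, 9), "admit_date": None},
    {"patient_id": "p2", "sex": "M",
     "birth_date": date(2021, 7, 4), "admit_date": date(2020, 1, 1)},
]

ALLOWED_SEX = {"F", "M", "U"}  # assumed value set, not taken from the paper


def value_conformance(records):
    """Value conformance: indices of records whose `sex` is not an allowed value."""
    return [i for i, r in enumerate(records) if r["sex"] not in ALLOWED_SEX]


def duplicate_keys(records):
    """Relational conformance: key values that occur more than once."""
    seen, dupes = set(), set()
    for r in records:
        key = r["patient_id"]
        if key in seen:
            dupes.add(key)
        seen.add(key)
    return sorted(dupes)


def completeness(records, field):
    """Completeness: fraction of records with a non-missing value for `field`."""
    if not records:
        return 0.0
    present = sum(1 for r in records if r.get(field) is not None)
    return present / len(records)


def temporal_plausibility(records):
    """Temporal plausibility: indices of records admitted before their birth date."""
    return [
        i for i, r in enumerate(records)
        if r["admit_date"] is not None and r["admit_date"] < r["birth_date"]
    ]
```

On the sample records, the value conformance check flags the second record (its `sex` is `X`), the key check reports `p2` as duplicated, two of the three records have an `admit_date`, and the third record fails the temporal check because its admission date precedes its birth date. Each function returns the offending records or a ratio rather than raising an error, on the assumption that a data quality assessment should report all violations instead of stopping at the first.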