Table 5 Improvement in predictive ability of data cleaning techniques

From: The effect of data cleaning on record linkage quality

Data cleaning technique      Hospital admissions data   Synthetic data
Remove punctuation           0.08% (a)                  +0.08%
Remove alt. missing values   +0.5%                      0%
Nickname lookup              −28%                       −33%
Sex imputation               NA                         −5%
a. A negative sign (−) indicates a decrease in predictive ability and a positive sign (+) indicates an increase, compared to baseline.