Table 1 PHI distributions in the 2014 i2b2/UTHealth de-identification corpus and UF Health clinical notes

From: A study of deep learning methods for de-identification of clinical notes in cross-institute settings

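Values are numbers of annotations.

| PHI Category | 2014 i2b2/UTHealth: Training | 2014 i2b2/UTHealth: Validation | UF Health: Training | UF Health: Validation | UF Health: Test |
|---|---|---|---|---|---|
| DATE | 9067 | 3104 | 2056 | 774 | 1872 |
| NAME | 5472 | 1868 | 856 | 356 | 771 |
| AGE | 1507 | 490 | 158 | 86 | 164 |
| ID | 1142 | 364 | 156 | 41 | 137 |
| PHONE | 406 | 128 | 50 | 28 | 47 |
| WEB | 6 | 1 | 0 | 0 | 4 |
| INSTITUTE | 1926 | 592 | 128 | 72 | 119 |
| STREET | 280 | 72 | 25 | 6 | 21 |
| CITY | 502 | 152 | 43 | 26 | 45 |
| ZIP | 276 | 76 | 34 | 11 | 20 |
| Total | 20,584 | 6847 | 3506 | 1400 | 3200 |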