
Table 2 Parameters used in different machine learning models after grid search optimization

From: O2 supplementation disambiguation in clinical narratives to support retrospective COVID-19 studies

Classifier

Parameters - Values

SVM

vectoriser - TF-IDF vectoriser (between 2450 and 2520 dimensions)

kernel - rbf (Radial Basis Function) kernel

regularisation parameter - C value of 10

cross-validation - 5 fold
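
The SVM settings above could be wired together roughly as follows. This is a minimal sketch assuming scikit-learn, which the table does not name. The candidate grid for C and the toy sentences and labels are hypothetical placeholders; only the rbf kernel, the selected C value of 10, and the 5-fold cross-validation come from the table.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC

# TF-IDF features feeding an RBF-kernel SVM, as listed in the table.
svm_pipeline = Pipeline([
    ("tfidf", TfidfVectorizer()),
    ("svm", SVC(kernel="rbf")),
])

# Hypothetical search grid: the table reports only the selected value, C = 10.
param_grid = {"svm__C": [0.1, 1, 10, 100]}
search = GridSearchCV(svm_pipeline, param_grid, cv=5)  # 5-fold cross-validation

# Hypothetical toy sentences (1 = supplemental O2 mentioned, 0 = not).
texts = [
    "patient placed on 2L oxygen via nasal cannula",
    "requiring 4L oxygen by face mask overnight",
    "started on high flow oxygen at 40L",
    "weaned to 1L oxygen via nasal cannula",
    "remains on supplemental oxygen 3L",
    "oxygen increased to 6L via mask",
    "saturating 98 percent on room air",
    "no supplemental oxygen required today",
    "breathing comfortably on room air",
    "oxygen saturation stable on room air",
    "denies shortness of breath on room air",
    "room air saturation 97 percent at rest",
]
labels = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0]

search.fit(texts, labels)
best_c = search.best_params_["svm__C"]
```

On real data, the number of TF-IDF dimensions follows from the vocabulary of the training folds, which is presumably why the table reports a range rather than a single value.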

RF

vectoriser - TF-IDF vectoriser (between 2450 and 2520 dimensions)

maximum features - square root of the total number of features

number of decision trees - 100

cross-validation - 5 fold
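
As a rough worked example of the maximum-features setting, the snippet below computes how many features each split would consider for the reported TF-IDF dimensionalities. It assumes the integer part of the square root is taken, as scikit-learn does for max_features="sqrt"; the table does not state the library or the rounding rule, and the helper name is hypothetical.

```python
import math


def features_per_split(n_features: int) -> int:
    """Integer part of sqrt(n_features), never less than 1 (assumed rounding rule)."""
    return max(1, math.isqrt(n_features))


# Lower and upper ends of the TF-IDF dimensionality range given in the table.
low_dims, high_dims = 2450, 2520
print(features_per_split(low_dims), features_per_split(high_dims))  # 49 50
```

Under that assumption, each of the 100 trees considers about 49 or 50 candidate features at every split.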

LSTM & Bi-LSTM

Embedding layer - 300 dimensional

LSTM / Bi-LSTM layer - 128 nodes

dropout and recurrent dropout layer - probability of 0.2

dense output layer - 1 node

activation layer - sigmoid

cross-validation - 5 fold

loss - binary cross entropy

optimizer - adamax
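
The recurrent configuration above might look like this in Keras. This is a sketch under stated assumptions: the table does not name the framework, the vocabulary size and dummy batch are hypothetical, and the dropout values are applied here through the LSTM layer's own dropout and recurrent_dropout arguments, which is one possible reading of the table. The 5-fold cross-validation loop is omitted.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers


def build_recurrent_model(vocab_size: int, bidirectional: bool = False) -> keras.Model:
    """LSTM or Bi-LSTM binary classifier using the layer sizes from the table."""
    recurrent = layers.LSTM(128, dropout=0.2, recurrent_dropout=0.2)
    if bidirectional:
        # Wrap the same layer so it reads the sequence in both directions.
        recurrent = layers.Bidirectional(recurrent)
    model = keras.Sequential([
        layers.Embedding(input_dim=vocab_size, output_dim=300),
        recurrent,
        layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(loss="binary_crossentropy", optimizer="adamax",
                  metrics=["accuracy"])
    return model


# Smoke check on a hypothetical batch: 2 sequences of 12 token ids each.
dummy_batch = np.zeros((2, 12), dtype="int32")
lstm_probs = build_recurrent_model(vocab_size=1000).predict(dummy_batch, verbose=0)
bilstm_probs = build_recurrent_model(vocab_size=1000, bidirectional=True).predict(
    dummy_batch, verbose=0)
```

Each model returns one sigmoid probability per input sequence, matching the single-node dense output in the table.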

CNN

Embedding layer - 300 dimensional

1D convolutional layer with:

- filters - 256

- window size - 5

- activation layer - relu

dropout layer - probability of 0.5

dense output layer - 1 node

activation layer - sigmoid

cross-validation - 5 fold

loss - binary cross entropy

optimizer - adamax
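
The CNN configuration above might be expressed in Keras as follows. This is a sketch, not the authors' code: the framework, the vocabulary size, and the dummy batch are assumptions, and the global max pooling layer is added here only because some reduction over the sequence is needed between the convolution and the single-node output; the table does not list one.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers


def build_cnn(vocab_size: int) -> keras.Model:
    """1D CNN binary classifier using the layer sizes from the table."""
    model = keras.Sequential([
        layers.Embedding(input_dim=vocab_size, output_dim=300),
        layers.Conv1D(filters=256, kernel_size=5, activation="relu"),
        layers.GlobalMaxPooling1D(),  # assumption: pooling step not listed in the table
        layers.Dropout(0.5),
        layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(loss="binary_crossentropy", optimizer="adamax",
                  metrics=["accuracy"])
    return model


# Smoke check on a hypothetical batch: 2 sequences of 20 token ids each.
# Sequences must be at least as long as the window size of 5.
dummy_batch = np.zeros((2, 20), dtype="int32")
cnn_probs = build_cnn(vocab_size=1000).predict(dummy_batch, verbose=0)
```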