From: An evaluation of time series summary statistics as features for clinical prediction tasks
Divide data into k = 5 folds
for k = 1 to 5
    Assign
        A = test data (1 fold, held out to test the random forest)
        B = training data (3 folds, used to train the random forest)
        C = validation data (1 fold, used to validate the random forest)
    Repeat using the training data (B) and validation data (C)
        step 1: Encode features as binary chromosomes
        step 2: Randomly generate a population of 20 chromosomes
        step 3: Evaluate the random forest AUROC for each chromosome from step 2
        step 4: Check whether the termination conditions are met
            if yes:
                Terminate
            else:
                step 5.1: Apply single-point crossover with probability 0.6
                step 5.2: Apply uniform mutation with probability 0.1
                step 5.3: Calculate the AUROC of the new chromosomes with the random forest and compare it with step 3
                step 5.4: Select the chromosomes with the highest fitness
                step 5.5: Replace the chromosomes with the lowest fitness, then return to step 4
    Train the random forest on data (B + C) using the summary statistics selected by the genetic algorithm
    Test the random forest on data (A)
    Calculate AUROC for fold k
End for
Calculate the average AUROC over the 5 folds
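
The genetic algorithm loop above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation. The names `fold_roles`, `run_ga` and `toy_fitness` are hypothetical. The fitness function is a stand-in: in the procedure above it would train a random forest on B using the features switched on in the chromosome and return the AUROC on C. The source does not specify the termination conditions, whether the mutation probability applies per gene or per chromosome, how the validation fold is chosen, or the exact replacement scheme, so the sketch assumes a fixed generation budget, per-gene mutation, the fold after the test fold as validation, and keeping the 20 fittest of parents plus children.

```python
import random


def fold_roles(k, n_folds=5):
    """Assumed rotation: fold k is test (A), the next fold is validation (C),
    and the remaining folds are training (B). Indices are 0-based."""
    test = k
    val = (k + 1) % n_folds
    train = [f for f in range(n_folds) if f not in (test, val)]
    return test, val, train


def single_point_crossover(a, b, rng):
    """Swap the tails of two chromosomes at one random cut point."""
    point = rng.randrange(1, len(a))
    return a[:point] + b[point:], b[:point] + a[point:]


def uniform_mutation(chrom, p_mut, rng):
    """Flip each bit independently with probability p_mut (per-gene assumption)."""
    return [1 - g if rng.random() < p_mut else g for g in chrom]


def run_ga(fitness, n_features, pop_size=20, p_cross=0.6, p_mut=0.1,
           max_gen=30, seed=0):
    """Binary-chromosome GA; returns the best chromosome and its fitness."""
    rng = random.Random(seed)
    # Steps 1-2: random population of binary feature masks.
    pop = [[rng.randint(0, 1) for _ in range(n_features)]
           for _ in range(pop_size)]
    # Step 3: evaluate the initial population.
    scores = [fitness(c) for c in pop]
    # Step 4: assumed termination condition is a fixed generation budget.
    for _ in range(max_gen):
        shuffled = pop[:]
        rng.shuffle(shuffled)
        children = []
        for a, b in zip(shuffled[::2], shuffled[1::2]):
            if rng.random() < p_cross:                      # step 5.1
                a, b = single_point_crossover(a, b, rng)
            children.append(uniform_mutation(a, p_mut, rng))  # step 5.2
            children.append(uniform_mutation(b, p_mut, rng))
        child_scores = [fitness(c) for c in children]       # step 5.3
        # Steps 5.4-5.5: keep the fittest, drop the least fit.
        ranked = sorted(zip(scores + child_scores, pop + children),
                        key=lambda pair: pair[0], reverse=True)[:pop_size]
        scores = [s for s, _ in ranked]
        pop = [c for _, c in ranked]
    return pop[0], scores[0]


# Hypothetical stand-in for "AUROC of the random forest on fold C".
TARGET = [1, 0, 1, 1, 0, 0, 1, 0]


def toy_fitness(chrom):
    """Fraction of genes that match an arbitrary target mask."""
    return sum(g == t for g, t in zip(chrom, TARGET)) / len(TARGET)


if __name__ == "__main__":
    for k in range(5):
        test, val, train = fold_roles(k)
        best, score = run_ga(toy_fitness, n_features=len(TARGET), seed=k)
        print(f"fold {k}: test={test} val={val} train={train} "
              f"best={best} fitness={score:.3f}")
```

Because selection here keeps the best of parents and children together, the best fitness never decreases between generations. A different replacement rule (for example, replacing only the worst few chromosomes) would also be consistent with steps 5.4 and 5.5.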