Improved prostate cancer diagnosis using a modified ResNet50-based deep learning architecture
Input layer: the prostate dataset, the weights wk, the mixing coefficients λa and λs, the learning rate η, the weight decay γ, and the other SGD and Adam parameters
Stage 1—Residual Blocks
 1. Residual block 1 (Bottleneck):
  1. Convolutional layer: 64 filters, kernel size 1 × 1
  2. Batch normalization layer
  3. ReLU activation layer
  4. Convolutional layer: 64 filters, kernel size 3 × 3
  5. Batch normalization layer
  6. ReLU activation layer
  7. Convolutional layer: 256 filters, kernel size 1 × 1
  8. Batch normalization layer
  9. Shortcut connection
  10. ReLU activation layer
 11. Repeat step 1 for residual blocks 2 and 3
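For concreteness, here is a minimal PyTorch sketch of the bottleneck block in steps 1–10. The class name, the default channel widths, and the projection shortcut used when shapes differ are illustrative assumptions, not details taken from the paper.

```python
import torch.nn as nn

class Bottleneck(nn.Module):
    """Bottleneck residual block: 1×1 reduce, 3×3, 1×1 expand (steps 1-10)."""
    def __init__(self, in_channels, mid_channels=64, out_channels=256, stride=1):
        super().__init__()
        self.conv1 = nn.Conv2d(in_channels, mid_channels, kernel_size=1, bias=False)
        self.bn1 = nn.BatchNorm2d(mid_channels)
        self.conv2 = nn.Conv2d(mid_channels, mid_channels, kernel_size=3,
                               stride=stride, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(mid_channels)
        self.conv3 = nn.Conv2d(mid_channels, out_channels, kernel_size=1, bias=False)
        self.bn3 = nn.BatchNorm2d(out_channels)
        self.relu = nn.ReLU(inplace=True)
        # Assumed projection shortcut for shape mismatches (step 9)
        self.shortcut = nn.Identity()
        if stride != 1 or in_channels != out_channels:
            self.shortcut = nn.Sequential(
                nn.Conv2d(in_channels, out_channels, kernel_size=1,
                          stride=stride, bias=False),
                nn.BatchNorm2d(out_channels),
            )

    def forward(self, x):
        out = self.relu(self.bn1(self.conv1(x)))    # steps 1-3
        out = self.relu(self.bn2(self.conv2(out)))  # steps 4-6
        out = self.bn3(self.conv3(out))             # steps 7-8
        out = out + self.shortcut(x)                # step 9: shortcut connection
        return self.relu(out)                       # step 10
```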
Stage 2—Residual Blocks
 12. Residual block 4 (Bottleneck):
  • Same as step 1, but with stride 2 in the second convolutional layer, 128 filters in the first and second convolutional layers, and 512 filters in the third convolutional layer
 13. Repeat step 1 for residual blocks 5, 6, and 7, but with 128 filters in the first and second convolutional layers and 512 filters in the third convolutional layer
Stage 3—Residual Blocks
 14. Residual block 8 (Bottleneck):
  • Same as step 1, but with stride 2 in the second convolutional layer, 256 filters in the first and second convolutional layers, and 1024 filters in the third convolutional layer
 15. Repeat step 1 for residual blocks 9–15, but with 256 filters in the first and second convolutional layers and 1024 filters in the third convolutional layer
Stage 4—Residual Blocks
 16. Residual block 16 (Bottleneck):
  • Same as step 1, but with stride 2 in the second convolutional layer, 512 filters in the first and second convolutional layers, and 2048 filters in the third convolutional layer
 17. Repeat step 1 for residual blocks 17 and 18, but with 512 filters in the first and second convolutional layers and 2048 filters in the third convolutional layer
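Reusing the Bottleneck class from the sketch above, the four stages can be assembled by stacking blocks with the counts and widths listed in steps 1–17 (3 + 4 + 8 + 3 = 18 blocks in total). This helper is a sketch: it presumes a standard ResNet stem producing 64 channels, which this excerpt does not show, and the stage-2 output width of 512 follows the ×4 expansion pattern of the other stages.

```python
import torch.nn as nn

def make_stage(in_channels, mid_channels, out_channels, num_blocks, stride):
    """Stack bottleneck blocks; only the first block in a stage downsamples."""
    blocks = [Bottleneck(in_channels, mid_channels, out_channels, stride=stride)]
    for _ in range(num_blocks - 1):
        blocks.append(Bottleneck(out_channels, mid_channels, out_channels, stride=1))
    return nn.Sequential(*blocks)

# Stage layout as described above: 3 + 4 + 8 + 3 = 18 residual blocks.
backbone = nn.Sequential(
    make_stage(64,   64,  256,  num_blocks=3, stride=1),  # stage 1, blocks 1-3
    make_stage(256,  128, 512,  num_blocks=4, stride=2),  # stage 2, blocks 4-7
    make_stage(512,  256, 1024, num_blocks=8, stride=2),  # stage 3, blocks 8-15
    make_stage(1024, 512, 2048, num_blocks=3, stride=2),  # stage 4, blocks 16-18
)
```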
 18. Region Proposal Network (RPN) layer
 19. RPN classification layer
 20. RPN regression layer
 21. RoIAlign layer
 22. Convolutional layer with 1024 filters and a kernel size of 3 × 3
 23. Mask classification layer
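Steps 18–23 describe a Mask R-CNN-style head on top of the backbone. The RPN branches (steps 18–20) involve anchor generation and are omitted here; the sketch below covers only steps 21–23, using torchvision's roi_align. The class name, RoI output size, sampling ratio, and number of classes are assumptions.

```python
import torch.nn as nn
from torchvision.ops import roi_align

class MaskHead(nn.Module):
    """RoIAlign, a 3×3 convolution with 1024 filters, and a per-pixel
    mask classifier (steps 21-23)."""
    def __init__(self, in_channels=2048, num_classes=2, roi_size=14):
        super().__init__()
        self.roi_size = roi_size
        self.conv = nn.Conv2d(in_channels, 1024, kernel_size=3, padding=1)  # step 22
        self.relu = nn.ReLU(inplace=True)
        self.mask_logits = nn.Conv2d(1024, num_classes, kernel_size=1)      # step 23

    def forward(self, features, proposals, spatial_scale):
        # proposals: list with one (N_i, 4) box tensor per image, as produced
        # by the RPN (steps 18-20, not shown)
        rois = roi_align(features, proposals, output_size=self.roi_size,
                         spatial_scale=spatial_scale, sampling_ratio=2)     # step 21
        return self.mask_logits(self.relu(self.conv(rois)))
```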
 24. Mixed optimizer:
  1. Adam for the first 10 epochs: learning rate 0.001
  2. SGD for the remaining epochs: learning rate 0.01
 25. For each batch:
  1. Update the weights with the mixed optimizer (sketched after the algorithm):
   1. Compute the Adam update: dk, ηa = ∆Adam(wk, ∇, η, γ, …)
   2. Compute the SGD update: vk = ∆SGD(wk, ∇, γ, …)
   3. Compute the mixed update: Mixed = λs · vk + λa · dk
   4. Compute the mixed learning rate: ηm = λs · η + λa · ηa
   5. Update the weights: wk+1 = wk − ηm · Mixed
 26. End For
 27. Output layer
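Steps 25.1–25.5 blend the two optimizers inside every batch rather than merely switching between them. Below is a minimal PyTorch sketch of that rule, assuming vk is the plain (weight-decayed) gradient and dk is the bias-corrected Adam direction; the class name and its defaults are illustrative, not the paper's implementation.

```python
import torch

class MixedOptimizer(torch.optim.Optimizer):
    """Per-batch mix of SGD and Adam (step 25):
    wk+1 = wk - ηm · (λs·vk + λa·dk), with ηm = λs·η + λa·ηa."""
    def __init__(self, params, lr_sgd=0.01, lr_adam=0.001, lambda_s=0.5,
                 lambda_a=0.5, weight_decay=0.0, betas=(0.9, 0.999), eps=1e-8):
        defaults = dict(lr_sgd=lr_sgd, lr_adam=lr_adam, lambda_s=lambda_s,
                        lambda_a=lambda_a, weight_decay=weight_decay,
                        betas=betas, eps=eps)
        super().__init__(params, defaults)

    @torch.no_grad()
    def step(self):
        for group in self.param_groups:
            b1, b2 = group["betas"]
            for p in group["params"]:
                if p.grad is None:
                    continue
                g = p.grad
                if group["weight_decay"] != 0:
                    g = g.add(p, alpha=group["weight_decay"])  # weight decay γ
                state = self.state[p]
                if not state:
                    state["step"] = 0
                    state["m"] = torch.zeros_like(p)
                    state["v"] = torch.zeros_like(p)
                state["step"] += 1
                m, v, t = state["m"], state["v"], state["step"]
                # Adam direction dk from bias-corrected moment estimates
                m.mul_(b1).add_(g, alpha=1 - b1)
                v.mul_(b2).addcmul_(g, g, value=1 - b2)
                d_k = (m / (1 - b1 ** t)) / ((v / (1 - b2 ** t)).sqrt() + group["eps"])
                # SGD direction vk (here, the raw gradient)
                v_k = g
                # Steps 25.3-25.5: mixed update and mixed learning rate
                mixed = group["lambda_s"] * v_k + group["lambda_a"] * d_k
                eta_m = (group["lambda_s"] * group["lr_sgd"]
                         + group["lambda_a"] * group["lr_adam"])
                p.add_(mixed, alpha=-eta_m)
```

One way to reconcile this per-batch rule with the epoch schedule in step 24 is to set λa = 1, λs = 0 for the first 10 epochs and λa = 0, λs = 1 afterwards, which reduces the mixed update to pure Adam and pure SGD, respectively.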