Purchase this article with an account.
Lutfiah Al Turk, Darina Georgieva, Hassan Alsawadi, Su Wang, Paul Krause, Hend Alsawadi, Abdulrahman Zaid Alshamrani, George M. Saleh, Hongying Lilian Tang; Learning to Discover Explainable Clinical Features With Minimum Supervision. Trans. Vis. Sci. Tech. 2022;11(1):11. doi: https://doi.org/10.1167/tvst.11.1.11.
Download citation file:
© ARVO (1962-2015); The Authors (2016-present)
To compare supervised transfer learning to semisupervised learning for their ability to learn in-depth knowledge with limited data in the optical coherence tomography (OCT) domain.
Transfer learning with EfficientNet-B4 and semisupervised learning with SimCLR are used in this work. The largest public OCT dataset, consisting of 108,312 images and four categories (choroidal neovascularization, diabetic macular edema, drusen, and normal) is used. In addition, two smaller datasets are constructed, containing 31,200 images for the limited version and 4000 for the mini version of the dataset. To illustrate the effectiveness of the developed models, local interpretable model-agnostic explanations and class activation maps are used as explainability techniques.
The proposed transfer learning approach using the EfficientNet-B4 model trained on the limited dataset achieves an accuracy of 0.976 (95% confidence interval [CI], 0.963, 0.983), sensitivity of 0.973 and specificity of 0.991. The semisupervised based solution with SimCLR using 10% labeled data and the limited dataset performs with an accuracy of 0.946 (95% CI, 0.932, 0.960), sensitivity of 0.941, and specificity of 0.983.
Semisupervised learning has a huge potential for datasets that contain both labeled and unlabeled inputs, generally, with a significantly smaller number of labeled samples. The semisupervised based solution provided with merely 10% labeled data achieves very similar performance to the supervised transfer learning that uses 100% labeled samples.
Semisupervised learning enables building performant models while requiring less expertise effort and time by using to good advantage the abundant amount of available unlabeled data along with the labeled samples.
This PDF is available to Subscribers Only