Open Access
Artificial Intelligence  |   August 2023
Can Glaucoma Suspect Data Help to Improve the Performance of Glaucoma Diagnosis?
Author Affiliations & Notes
  • Ashkan Abbasi
    Department of Ophthalmology, Casey Eye Institute, Oregon Health & Science University, Portland, OR, USA
  • Bhavna Josephine Antony
    Electrical and Computer System Engineering, Faculty of Engineering, Monash University, Clayton, Victoria, Australia
    Department of Infectious Diseases, Alfred Health, Melbourne, Victoria, Australia
  • Sowjanya Gowrisankaran
    Department of Ophthalmology, Casey Eye Institute, Oregon Health & Science University, Portland, OR, USA
  • Gadi Wollstein
    Department of Ophthalmology, NYU Langone Health, New York, NY, USA
    Department of Biomedical Engineering, NYU Tandon School of Engineering, Brooklyn, NY, USA
  • Joel S. Schuman
    Department of Ophthalmology, NYU Langone Health, New York, NY, USA
    Department of Biomedical Engineering, NYU Tandon School of Engineering, Brooklyn, NY, USA
    Department of Electrical & Computer Engineering, NYU Tandon School of Engineering, Brooklyn, NY, USA
  • Hiroshi Ishikawa
    Department of Ophthalmology, Casey Eye Institute, Oregon Health & Science University, Portland, OR, USA
    Department of Medical Informatics and Clinical Epidemiology, Oregon Health & Science University, Portland, OR, USA
  • Correspondence: Hiroshi Ishikawa, Department of Ophthalmology, Casey Eye Institute, Oregon Health & Science University, 3375 SW Terwilliger Blvd, Portland, 97239 OR, USA. e-mail: ishikawh@ohsu.edu 
Translational Vision Science & Technology August 2023, Vol.12, 6. doi:https://doi.org/10.1167/tvst.12.8.6
      Ashkan Abbasi, Bhavna Josephine Antony, Sowjanya Gowrisankaran, Gadi Wollstein, Joel S. Schuman, Hiroshi Ishikawa; Can Glaucoma Suspect Data Help to Improve the Performance of Glaucoma Diagnosis?. Trans. Vis. Sci. Tech. 2023;12(8):6. https://doi.org/10.1167/tvst.12.8.6.

      © ARVO (1962-2015); The Authors (2016-present)

Abstract

Purpose: The presence of imbalanced datasets in medical applications can negatively affect deep learning methods. This study aims to investigate how the performance of convolutional neural networks (CNNs) for glaucoma diagnosis can be improved by addressing imbalanced learning issues through utilizing glaucoma suspect samples, which are often excluded from studies because they are a mixture of healthy and preperimetric glaucomatous eyes, in a semi-supervised learning approach.

Methods: A baseline 3D CNN was developed and trained on a real-world glaucoma dataset, which is naturally imbalanced (like many other real-world medical datasets). Then, three methods, including reweighting samples, data resampling to form balanced batches, and semi-supervised learning on glaucoma suspect data were applied to practically assess their impacts on the performances of the trained methods.

Results: The proposed method achieved a mean accuracy of 95.24%, an F1 score of 97.42%, and an area under the curve of receiver operating characteristic (AUC ROC) of 95.64%, whereas the corresponding results for the traditional supervised training using weighted cross-entropy loss were 92.88%, 96.12%, and 92.72%, respectively. The obtained results show statistically significant improvements in all metrics.

Conclusions: Exploiting glaucoma suspect eyes in a semi-supervised learning method coupled with resampling can improve glaucoma diagnosis performance by mitigating imbalanced learning issues.

Translational Relevance: Clinical imbalanced datasets may negatively affect medical applications of deep learning. Utilizing data with uncertain diagnosis, such as glaucoma suspects, through a combination of semi-supervised learning and class-imbalanced learning strategies can partially address the problems of having limited data and learning on imbalanced datasets.

Introduction
Glaucoma is a neurodegenerative disease and one of the leading causes of blindness worldwide.1 Deep learning approaches have recently been used to diagnose and assess glaucoma with promising results. Fundus images,2–5 optical coherence tomography (OCT) volumes,6–9 or thickness maps10–14 obtained from OCT volumes are usually utilized for glaucoma assessment through either segmentation-based10–13 or segmentation-free6–9,14 methods. However, regardless of the underlying models and imaging types utilized for diagnosis, in a real-world glaucoma dataset, the number of samples from one class (the majority class, e.g. glaucoma cases) is much higher than the other class (the minority class, e.g. healthy cases), and thus the datasets are imbalanced.5–9,15 Training models on imbalanced datasets usually leads to inaccurate parameter estimation and generalization failure because the learning algorithm spends most of its time training from majority class samples while underestimating minority class samples. 
Two common ways to handle imbalanced learning are reweighting samples and re-sampling.16,17 In addition, it has recently been demonstrated that using unlabeled data alongside labeled data in a semi-supervised learning approach can improve performance in class imbalanced learning.18 Here, we hypothesized that utilizing the gray zone data of glaucoma suspects as a source of unlabeled data in addition to the typically used fully labeled glaucoma dataset (which includes both healthy and glaucomatous samples) would be beneficial in developing a feature agnostic 3D convolutional neural network (CNN) for glaucoma diagnosis from OCT volumes. 
In reality, glaucoma suspect samples are a mix of healthy and preperimetric glaucoma cases with suspicious optic disc appearance and/or ocular hypertension. Due to the difficulty with accurate early diagnosis, it is a common practice to classify patients as glaucoma suspect when all the diagnosis criteria are not satisfied.19 Although, in a few studies, glaucoma suspect has been treated as a separate class from both healthy and glaucomatous eyes,20,21 in the majority of studies,2–15,22 glaucoma suspect samples were simply discarded. The former approach retains the distribution of glaucoma and healthy samples, but it cannot help with class imbalanced learning to improve diagnosis performance. The latter approach discards portions of data obtained during clinical studies with the hope of avoiding confounding the data.2–15,22 In contrast, in this paper, we will show that glaucoma suspect samples can be used to (1) mitigate the limitations of imbalanced learning and (2) increase the overall number of samples. Both of these features are crucial for training deep neural networks. To the best of our knowledge, this is the first study to propose the creative use of glaucoma suspect data in order to maximally exploit all glaucoma-related data in a clinical study for improving the performance of training a deep learning-based glaucoma diagnosis system. 
Materials and Methods
Dataset
Optic nerve head (ONH) centered OCT volumes were captured using spectral-domain OCT devices (Cirrus HD-OCT; Zeiss, Dublin, CA, USA) on patients over multiple visits between 2005 and 2019 at the UPMC Eye Center and the NYU Langone Eye Center. Eyes with retinal diseases other than glaucoma or with refractive errors greater than +6 or smaller than -6 diopters were excluded. Each OCT volume was obtained by scanning an area of 6 × 6 × 2 mm³ on the retina and storing the result as a 200 × 200 × 1024 (horizontal × vertical × depth) data cube. Data collection was conducted in accordance with the tenets of the Declaration of Helsinki and the Health Insurance Portability and Accountability Act (HIPAA). The Institutional Review Board of New York University and the University of Pittsburgh approved the study, and all subjects gave written consent before participation. 
Our initial dataset consisted of 12,863 OCT volumes (with signal strength ≥6) from the eyes of 794 individuals. These scans were labeled either as healthy or as having primary open angle glaucoma (POAG). Eyes with at least two consecutive abnormal visual field test results were considered glaucomatous by specialists, and healthy scans were captured from individuals without a clinical history of glaucoma. Given this dataset, we first removed repeated scans (which were taken on the same date from the same individual) and kept the scans with the highest signal strength. In the end, 8476 scans remained, of which 683 were healthy and 7793 were POAG scans. Therefore, the class imbalance ratio (defined as the ratio of the number of samples in the majority class to the number of samples in the minority class18) is approximately 11.4 (7793/683). 
Demographic characteristics of healthy and glaucomatous samples are provided in the second and fourth columns of Table 1. We divided all healthy and glaucoma scans into three subsets (training, validation, and test) containing 5434 (442 healthy scans from 113 individuals), 1348 (105 healthy from 23 individuals), and 1694 (136 healthy from 32 individuals) samples, respectively. We ensured that OCT volumes belonging to the same patient were included in only one of the three splits to avoid data leakage, and the class imbalance ratio was kept fixed in each split. We refer to this dataset as the main dataset. 
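The patient-level split described above can be illustrated with a minimal sketch. It is not the authors' code: the function name `split_by_patient`, the `(scan_id, patient_id)` input format, the split fractions, and the seed are all assumptions made for illustration, and the sketch does not reproduce the additional constraint of keeping the class imbalance ratio fixed in each split.

```python
import random
from collections import defaultdict


def split_by_patient(scans, fractions=(0.64, 0.16, 0.20), seed=0):
    """Split (scan_id, patient_id) pairs so that no patient spans two subsets.

    The fractions apply to patients, not scans, and are illustrative only.
    """
    by_patient = defaultdict(list)
    for scan_id, patient_id in scans:
        by_patient[patient_id].append(scan_id)

    patients = sorted(by_patient)  # deterministic order before shuffling
    random.Random(seed).shuffle(patients)

    n_train = int(fractions[0] * len(patients))
    n_val = int(fractions[1] * len(patients))
    groups = (
        patients[:n_train],
        patients[n_train:n_train + n_val],
        patients[n_train + n_val:],
    )
    # Every scan follows its patient into exactly one subset.
    return [[(s, p) for p in group for s in by_patient[p]] for group in groups]


# Hypothetical toy data: 50 scans from 10 patients.
scans = [(f"scan{i}", f"patient{i % 10}") for i in range(50)]
train, val, test = split_by_patient(scans)
```

Splitting on patient identifiers rather than on scans is what prevents repeated visits of the same eye from appearing in both the training and test subsets. To also preserve the class ratio, one option would be to run the same routine separately on the healthy and glaucomatous groups and merge the results.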
Table 1.
 
Demographic Characteristics of the Dataset
In addition to the main dataset, there was an extra set of 8318 OCT volumes from 751 individuals, which were labeled as glaucoma suspect because the visual field test results were normal but at least one of the following criteria was met: intraocular pressure of 22 to 30 mm Hg; abnormal ONH appearance (including but not limited to increased cupping [diffuse or focal narrowing of the disc rim], asymmetric ONH cupping, recurrent disc hemorrhages, large ONH with thin disc rims, and/or anomalous discs); or an eye that was the contralateral eye of unilateral glaucoma. Similar to the main dataset of healthy and POAG scans, we removed the repeated scans from glaucoma suspect samples, and the remaining 5648 OCT volumes were kept aside to be used as our additional source of data for semi-supervised training (Proposed Method section). Finally, all OCT volumes were downsampled to 64 × 64 × 256 data cubes using 3D bicubic interpolation because training a neural network with data samples in their original size is impractical.6–9,15 The demographic characteristics of glaucoma suspect samples are presented in the third column of Table 1.
Proposed Method
Following previous work,6–9,15 we first developed a baseline 3D CNN consisting of 7 convolutional layers followed by a global average pooling and a fully connected layer with two neurons (Fig. 1). Each convolution layer has a 3 × 3 convolution kernel with stride 1, batch normalization, and rectified linear unit (ReLU) activation. Next, 3D maximum pooling is used after each convolution layer to gradually reduce the dimensionality of data and extract more semantically meaningful features. The numbers of feature maps in the convolution layers are 16, 16, 32, 32, 32, 64, and 128, respectively. Then, the output of the fully connected layer is followed by a Softmax activation to provide a probability score for each class. 
Figure 1.
 
The 3D CNN architecture used for glaucoma diagnosis from OCT volumes. The numbers in front of the letters k and n indicate kernel size and number of features/neurons, respectively.
First, to train the 3D CNN architecture in Figure 1 on the main dataset (without glaucoma suspect samples; Dataset section), weighted cross-entropy (WCE) is used, following previous work.6–9,15 In WCE, samples from each category are weighted by the reciprocal of their corresponding class proportions. Thus, the cost of failures on minority (healthy) samples increases. We refer to this method as “3D CNN + WCE.” 
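As a rough illustration of the weighting described above, the following sketch computes class weights as the reciprocal of the class proportions and applies them in a cross-entropy loss. The function names, the label encoding (0 = healthy, 1 = glaucoma), and the choice to average the weighted losses over the batch are assumptions for illustration; the training-split counts (442 healthy out of 5434 scans) come from the Dataset section.

```python
import math


def class_weights(counts):
    """Weight each class by the reciprocal of its proportion: total / count."""
    total = sum(counts.values())
    return {label: total / n for label, n in counts.items()}


def weighted_cross_entropy(probs, labels, weights):
    """Mean of per-sample cross-entropy terms scaled by the class weight.

    probs[i][c] is the predicted probability of class c for sample i.
    """
    losses = [-weights[y] * math.log(p[y]) for p, y in zip(probs, labels)]
    return sum(losses) / len(losses)


# Training split: 442 healthy scans and 5434 - 442 = 4992 glaucoma scans.
weights = class_weights({0: 442, 1: 4992})

# Two equally uncertain predictions: the healthy sample costs far more.
loss = weighted_cross_entropy([[0.5, 0.5], [0.5, 0.5]], [0, 1], weights)
```

With these counts, an error on a healthy scan is weighted roughly 11 times more heavily than the same error on a glaucoma scan, which is the intended effect of increasing the cost of failures on the minority class.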
Next, we construct our second method by exploiting resampling (RE),16,17,23 which is a common method to handle imbalanced learning. Specifically, we construct balanced batches during training by uniformly sampling from both classes of the training set.23 In this way, the same number of samples from each class are presented in every training batch, and, thus, the gradients always have useful information about both classes. We refer to this method as “3D CNN + RE.” 
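The balanced-batch idea can be sketched as a small generator. This is an assumption-laden illustration rather than the authors' implementation: the function name, sampling with replacement, the number of batches, and the label encoding (0 = healthy, 1 = glaucoma) are hypothetical choices; only the batch size of 8 is taken from the training setup reported later in this section.

```python
import random


def balanced_batches(healthy, glaucoma, batch_size=8, num_batches=100, seed=0):
    """Yield batches holding the same number of samples from each class."""
    rng = random.Random(seed)
    half = batch_size // 2
    for _ in range(num_batches):
        # Sample uniformly (with replacement) within each class.
        batch = [(x, 0) for x in rng.choices(healthy, k=half)]
        batch += [(x, 1) for x in rng.choices(glaucoma, k=half)]
        rng.shuffle(batch)  # avoid a fixed class order inside the batch
        yield batch


# Hypothetical scan identifiers with a strong class imbalance.
healthy_ids = [f"h{i}" for i in range(5)]
glaucoma_ids = [f"g{i}" for i in range(60)]
first_batch = next(balanced_batches(healthy_ids, glaucoma_ids))
```

Because the minority class is much smaller, its samples are revisited far more often than majority samples, so each gradient step sees both classes equally regardless of the imbalance in the underlying pool.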
In order to harness the power of all glaucoma data that is usually collected in clinical studies (Dataset section), the third 3D CNN model is trained on a dataset comprising the main dataset (healthy and glaucomatous samples) and glaucoma suspect samples. We treat glaucoma suspect samples as a source of unlabeled data in a semi-supervised learning (SSL) approach by adding them to the pool of training samples and leaving both validation and test subsets untouched. Pseudo-labels for glaucoma suspect samples are generated using our second model (3D CNN + RE). We are more confident about the labels in the main dataset than about these pseudo-labels. Therefore, we reduce the weight of the pseudo-labeled samples in the overall loss computation by using L_T = L + αL_U as the total loss function, where L denotes the cross-entropy loss for the samples from the main dataset, L_U denotes the cross-entropy loss for the glaucoma suspect samples, and the weight α controls the contribution of the loss from glaucoma suspect samples. We refer to this method as “3D CNN + SSL.” In this method, the hyperparameter α was empirically set to 0.7 by checking the performance on the validation set. When α was less than 0.7, the improvement over 3D CNN + RE decreased or became negligible, and larger values resulted in a performance decline. 
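A minimal sketch of the total loss, under stated assumptions, may help make the weighting concrete. The function names are hypothetical, and the sketch assumes hard pseudo-labels obtained by taking the most probable class from the earlier model's predictions, which the text does not specify; the value α = 0.7 is the one reported above.

```python
import math


def cross_entropy(probs, labels):
    """Mean cross-entropy; probs[i][c] is the probability of class c."""
    return sum(-math.log(p[y]) for p, y in zip(probs, labels)) / len(labels)


def pseudo_labels(teacher_probs):
    """Assumed rule: take the most probable class as a hard pseudo-label."""
    return [max(range(len(p)), key=p.__getitem__) for p in teacher_probs]


def total_loss(labeled_probs, labels, suspect_probs, suspect_labels, alpha=0.7):
    """Total loss: labeled term plus alpha times the pseudo-labeled term."""
    loss_labeled = cross_entropy(labeled_probs, labels)
    loss_suspect = cross_entropy(suspect_probs, suspect_labels)
    return loss_labeled + alpha * loss_suspect


# Hypothetical predictions of the earlier model on two suspect scans.
suspect_targets = pseudo_labels([[0.9, 0.1], [0.2, 0.8]])

# Hypothetical current-model predictions on labeled and suspect scans.
value = total_loss([[0.3, 0.7]], [1], [[0.6, 0.4], [0.4, 0.6]], suspect_targets)
```

Setting `alpha` below 1 means that a mistake on a pseudo-labeled suspect scan moves the parameters less than the same mistake on a scan with a confirmed label.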
In addition to the 3 methods described above, we also trained 2 combined configurations: (1) “3D CNN + SSL + RE,” in which a 3D CNN is trained on the main dataset with glaucoma suspect samples included while performing the resampling technique to create balanced batches during training, and (2) “3D CNN + SSL + RE + WCE,” in which we also applied weighted cross-entropy along with SSL and resampling by creating balanced batches. Similar to “3D CNN + WCE,” the weight for each sample is computed as the reciprocal of the total number of samples in its corresponding class. 
We trained all of these methods using stochastic gradient descent with a fixed learning rate of 0.01 and a batch size of 8. We kept these hyperparameters fixed across our experiments to reduce the chance that they would confound the effect of the other factors on performance. During training, early stopping with a patience of 10 epochs was used, and checkpoints were saved to prevent overfitting and to select the best model. The methods were implemented using Python and Keras with a TensorFlow24 backend on a desktop PC with 16 GB of RAM and an NVIDIA GeForce RTX 2080 Ti GPU. 
In the upcoming section, the performances of the methods will be evaluated using accuracy, balanced accuracy, F1 score, and area under the curve of receiver operating characteristic (AUC ROC) metrics. Accuracy is defined as the number of correctly classified samples divided by the total number of samples in the test dataset. Balanced accuracy25 is defined as the arithmetic mean of sensitivity (or recall) and specificity, where sensitivity is the ratio of correct glaucoma predictions to the total number of glaucoma samples and specificity is the ratio of correct healthy predictions to the total number of healthy samples. The F1 score is the harmonic mean of precision and recall, that is, 2 × precision × recall/(precision + recall), where precision is the ratio of correct glaucoma predictions to the total number of glaucoma predictions. Both balanced accuracy and F1 score are usually reported when the dataset is imbalanced. The F1 score rewards classifiers that predict the positive class (here, the glaucoma class) precisely, whereas balanced accuracy rewards classifiers that have good recognition rates on both positive and negative classes. AUC ROC measures the ability of the classifier to distinguish between classes and summarizes the ROC curve. 
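The definitions above translate directly into a few lines of code. The sketch below is illustrative only: the function name and the label encoding (1 = glaucoma as the positive class, 0 = healthy) are assumptions, and it presumes that both classes occur in the labels and that at least one sample is predicted positive, so that no denominator is zero.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, balanced accuracy, and F1 score from label lists."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)

    sensitivity = tp / (tp + fn)  # recall on the glaucoma class
    specificity = tn / (tn + fp)  # recall on the healthy class
    precision = tp / (tp + fp)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "balanced_accuracy": (sensitivity + specificity) / 2,
        "f1": 2 * precision * sensitivity / (precision + sensitivity),
    }


# A degenerate classifier that labels every scan in the test split
# (1694 scans, 136 of them healthy) as glaucoma.
y_true = [0] * 136 + [1] * (1694 - 136)
y_pred = [1] * 1694
collapsed = binary_metrics(y_true, y_pred)
```

For this degenerate classifier the accuracy is about 0.92, simply the proportion of glaucoma scans, while the balanced accuracy is 0.5. This is why accuracy alone is not informative on an imbalanced dataset.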
Results
To evaluate the compared methods, 5-fold cross-validation was used, and the average metric results on the test dataset consisting of 1694 OCT volumes (Dataset section) along with 95% confidence intervals are reported in Table 2. Among methods with (w/) or without (w/o) glaucoma suspect samples, the best performance was shown by 3D CNN + SSL + RE. In addition, the Mann Whitney U test26 was performed to statistically compare the results of each method against the best baseline method without glaucoma suspect samples, which is 3D CNN + RE. In this test, a U value less than the critical value at the 0.05 significance level is considered statistically significant. The proposed 3D CNN + SSL + RE method is the only SSL-based method that both achieved better average results and was statistically significantly better than the baseline. 
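For readers unfamiliar with the test, the U statistic can be computed by counting pairwise wins between two groups of scores. The sketch below is a generic textbook formulation, not the authors' analysis code; the function name is hypothetical and the per-fold scores are made-up toy values, not results from this study.

```python
def mann_whitney_u(a, b):
    """Return the smaller of the two U statistics; ties count as one half."""
    u_a = sum(
        1.0 if x > y else 0.5 if x == y else 0.0
        for x in a
        for y in b
    )
    u_b = len(a) * len(b) - u_a  # the two statistics sum to len(a) * len(b)
    return min(u_a, u_b)


# Made-up per-fold scores for two hypothetical methods (five folds each).
method_scores = [0.95, 0.96, 0.94, 0.97, 0.95]
baseline_scores = [0.92, 0.93, 0.91, 0.93, 0.92]
u_value = mann_whitney_u(method_scores, baseline_scores)
```

The resulting U value would then be compared with the tabulated critical value for the two sample sizes at the chosen significance level. In practice, a library routine such as SciPy's `scipy.stats.mannwhitneyu` can be used instead to obtain a p-value directly.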
Table 2.
 
Average and 95% Confidence Interval of Accuracy, F1 Score, and AUC ROC Results for the Compared Methods
In Table 2, the results show that 3D CNN + SSL was as effective as using weighted cross-entropy (3D CNN + WCE) without glaucoma suspect samples, which is a widely used technique in learning from an imbalanced glaucoma dataset.6–9,15 However, 3D CNN + RE, which uses data resampling, shows better performance than 3D CNN + SSL. Given the effectiveness of resampling, we retained it and coupled it with both glaucoma suspect samples (through SSL) and WCE to test their contributions in practice. The results in Table 2 show that the performance was improved by exploiting glaucoma suspect samples along with the resampling technique (3D CNN + SSL + RE). However, adding WCE (3D CNN + SSL + RE + WCE) had a negative effect, probably because modifying the weights in the loss function became less important once the amount of data was increased by using glaucoma suspect samples and the mini-batches were resampled to contain the same number of samples from each class. It is also worth mentioning that, in our experiments, when no imbalanced learning method was used, the training collapsed and the accuracy of the trained method was simply equal to the proportion of glaucoma (the majority class) samples, that is, the trained method classified every input image as glaucoma and was not able to recognize the healthy cases. 
Discussion
The superiority of 3D CNN + SSL (semi-supervised learning with the help of glaucoma suspect samples) over the common 3D CNN + WCE (weighted cross-entropy) is consistent with the findings of Ref. 18, which reported that SSL was at least as effective as reweighting approaches, such as WCE, in learning from imbalanced datasets. However, in their experiments, resampling had inferior results in comparison to SSL. Data resampling16–18 is a common technique to handle learning on imbalanced datasets, and it is usually applied at the data level by random subsampling. In our experiments, we used a resampling technique that has previously been shown to be very effective in facial expression recognition on imbalanced datasets.23 Specifically, in 3D CNN + RE, we created balanced batches to prevent the training batches from being dominated by samples from one class.24 
Although, in our experiments, we have practically shown that glaucoma suspect samples can be used for improving performance, it is worth mentioning that this gain was observed because glaucoma suspect samples are indeed a mixture of healthy and preperimetric glaucoma cases.19 This similarity is important for expecting benefits from using unlabeled data in an SSL approach over an imbalanced dataset.18 To intuitively show this similarity, we projected the feature vectors obtained from the global average pooling (GAP) layer at the end of a trained 3D CNN into 2D using t-SNE27 in Figure 2 to visually compare how the network represents glaucoma suspect, healthy, and glaucomatous samples. The features computed for glaucoma suspect samples are very similar to those of healthy and glaucomatous ones. Therefore, using glaucoma suspect samples is a way of increasing the amount of similar data, and it partially addresses the problem of limited data in glaucoma diagnosis by exploiting all the data collected in a clinical study. 
Figure 2.
 
Left: The 2D projected features for the validation subset (Dataset section) containing both healthy (H) and glaucomatous (G) samples. Right: Projected features for a random subset of glaucoma suspect samples are overlaid on the healthy and glaucomatous samples. Note that the network features show that glaucoma suspect samples are semantically very close to both healthy and glaucomatous eyes. The horizontal and vertical axes represent the 2D feature-space dimensions.
In conclusion, this paper emphasized the importance of addressing class-imbalanced learning for the performance of a deep learning-based glaucoma diagnosis method from OCT volumes. Although many studies discard glaucoma suspect samples,2–15,22 our experiments showed that the novel usage of glaucoma suspect samples through an SSL approach can partially alleviate imbalanced learning issues by increasing the number of semantically similar training examples. However, the use of glaucoma suspect samples alone is not sufficient; to boost the performance, it is crucial to also use resampling by creating balanced batches, which further addresses imbalanced learning issues and exploits all relevant data collected in a clinical study for training the method. 
Acknowledgments
Supported by the National Institutes of Health (Bethesda, MD) R01EY030929, R01EY013178, and P30 EY010572 core grant, the Malcolm M. Marquis, MD Endowed Fund for Innovation, and an unrestricted grant from Research to Prevent Blindness (New York, NY) to Casey Eye Institute, Oregon Health & Science University. 
Conflict of Interest: J.S. Schuman receives royalties for intellectual property licensed by the Massachusetts Institute of Technology and Massachusetts Eye and Ear Infirmary to Zeiss. 
Disclosure: A. Abbasi, None; B.J. Antony, None; S. Gowrisankaran, None; G. Wollstein, None; J.S. Schuman, Zeiss (R); H. Ishikawa, None 
References
1. Flaxman SR, Bourne RRA, Resnikoff S, et al. Global causes of blindness and distance vision impairment 1990–2020: a systematic review and meta-analysis. Lancet Glob Health. 2017; 5(12): 1221–1234.
2. Sevastopolsky A. Optic disc and cup segmentation methods for glaucoma detection with modification of U-Net convolutional neural network. Pattern Recognition and Image Analysis. 2017; 27(3): 618–624.
3. Kim J, Tran L, Chew EY, Antani S. Optic disc and cup segmentation for glaucoma characterization using deep learning. In: IEEE Symposium on Computer-Based Medical Systems. New York, NY: Institute of Electrical and Electronics Engineers Inc.; 2019; 489–494.
4. Veena HN, Muruganandham A, Senthil Kumaran T. A novel optic disc and optic cup segmentation technique to diagnose glaucoma using deep learning convolutional neural network over retinal fundus images. Journal of King Saud University - Computer and Information Sciences. 2022; 34(8): 6187–6198.
5. Zhao R, Chen X, Chen Z, Li S. Diagnosing glaucoma on imbalanced data with self-ensemble dual-curriculum learning. Med Image Anal. 2022; 75: 102295.
6. George Y, Antony B, Ishikawa H, Wollstein G, Schuman J, Garnavi R. 3D-CNN for glaucoma detection using optical coherence tomography. In: Singapore: International Workshop on Ophthalmic Medical Image Analysis. 2019; 52–59.
7. George Y, Antony BJ, Ishikawa H, Wollstein G, Schuman JS, Garnavi R. Attention-guided 3D-CNN framework for glaucoma detection and structural-functional association using volumetric images. IEEE J Biomed Health Inform. 2020; 24(12): 3421–3430.
8. Yu HH, Maetschke S, Antony BJ, et al. Estimating visual field functions in glaucoma patients using multi-regional neural networks on OCT images. Invest Ophthalmol Vis Sci. 2019; 60(9): 1462–1462.
9. Yu HH, Maetschke SR, Antony BJ, et al. Estimating global visual field indices in glaucoma by combining macula and optic disc OCT scans using 3-dimensional convolutional neural networks. Ophthalmol Glaucoma. 2021; 4(1): 102–112.
10. Muhammad H, Fuchs TJ, De Cuir N, et al. Hybrid deep learning on single wide-field optical coherence tomography scans accurately classifies glaucoma suspects. J Glaucoma. 2017; 26(12): 1086.
11. Thakoor KA, Li X, Tsamis E, Sajda P, Hood DC. Enhancing the accuracy of glaucoma detection from OCT probability maps using convolutional neural networks. In: The 41st Annual International Conference of the IEEE Engineering in Medicine and Biology Society. 2019; 2036–2040.
12. Thakoor KA, Li X, Tsamis E, et al. Strategies to improve convolutional neural network generalizability and reference standards for glaucoma detection from OCT scans. Transl Vis Sci Technol. 2021; 10(4): 16.
13. Asaoka R, Murata H, Hirasawa K, et al. Using deep learning and transfer learning to accurately diagnose early-onset glaucoma from macular optical coherence tomography images. Am J Ophthalmol. 2019; 198: 136–145.
14. Mariottoni EB, Jammal AA, Urata CN, et al. Quantification of retinal nerve fibre layer thickness on optical coherence tomography with a deep learning segmentation-free approach. Scientific Reports. 2020; 10(1): 1–9.
15. Maetschke S, Antony B, Ishikawa H, Wollstein G, Schuman J, Garnavi R. A feature agnostic approach for glaucoma detection in OCT volumes. PLoS One. 2019; 14(7): 0219126.
16. Le Borgne YA, Siblini W, Lebichot B, Bontempi G. Reproducible machine learning for credit card fraud detection - practical handbook. Université Libre de Bruxelles. 2022, https://github.com/Fraud-Detection-Handbook/fraud-detection-handbook.
17. Fernández A, García S, Galar M, Prati RC, Krawczyk B, Herrera F. Learning from imbalanced data sets. Cambridge: Springer; 2018, https://doi.org/10.1007/978-3-319-98074-4.
18. Yang Y, Xu Z. Rethinking the value of labels for improving class-imbalanced learning. In: Larochelle H, Ranzato M, Hadsell R, Balcan MF, Lin H, eds. Advances in Neural Information Processing Systems. New York, NY: Curran Associates, Inc.; 2020; 19290–19301.
19. Salmon NJ, Terry HP, Farmery AD, Salmon JF. An analysis of patients discharged from a hospital-based glaucoma case-finding clinic over a 3-year period. Ophthalmic and Physiological Optics. 2007; 27(4): 399–403.
20. Jun TJ, Kim D, Nguyen HM, Kim D, Eom Y. 2sRanking-CNN: a 2-stage ranking-CNN for diagnosis of glaucoma from fundus images using CAM-extracted ROI as an intermediate input. In: The 29th British Machine Vision Conference, Newcastle, UK, September 3–6. British Machine Vision Association (BMVA) Press; 2018.
21. Jun TJ, Eom Y, Kim D, et al. TRk-CNN: transferable ranking-CNN for image classification of glaucoma, glaucoma suspect, and normal eyes. Expert Syst Appl. 2021; 182: 115211.
22. Li Z, He Y, Keel S, Meng W, Chang RT, He M. Efficacy of a deep learning system for detecting glaucomatous optic neuropathy based on color fundus photographs. Ophthalmology. 2018; 125(8): 1199–1206.
23. Sönmez EB, Cangelosi A. Convolutional neural networks with balanced batches for facial expressions recognition. In: Verikas A, Radeva P, Nikolaev DP, Zhang W, Zhou J, eds. The 9th International Conference on Machine Vision. SPIE; 2017; 103410J.
24. Abadi M, Barham P, Chen J, et al. TensorFlow: a system for large-scale machine learning. In: The 12th USENIX Conference on Operating Systems Design and Implementation. USENIX Association; 2016.
25. Brodersen KH, Ong CS, Stephan KE, Buhmann JM. The balanced accuracy and its posterior distribution. In: The 20th International Conference on Pattern Recognition. 2010; 3121–3124.
26. Mann HB, Whitney DR. On a test of whether one of two random variables is stochastically larger than the other. The Annals of Mathematical Statistics. 1947; 18(1): 50–60.
27. van der Maaten L, Hinton G. Visualizing data using t-SNE. J Mach Learn Res. 2008; 9: 2579–2605.