Abstract
Purpose:
To develop and assess a deep learning system that automatically detects angle closure and quantitatively measures angle parameters from ultrasound biomicroscopy (UBM) images.
Methods:
A total of 3788 UBM images (2146 open angle and 1642 angle closure) from 1483 patients were collected. We developed a convolutional neural network (CNN) based on the InceptionV3 architecture to automatically classify images as angle closure or open angle. For open-angle images, we developed a CNN based on the EfficientNetB3 architecture to automatically localize the scleral spur and the angle recess; a U-Net network was then used to automatically segment the anterior chamber angle (ACA) tissue. Based on the results of the latter two processes, we developed an algorithm to automatically measure the trabecular-iris angle (TIA500 and TIA750), angle-opening distance (AOD500 and AOD750), and angle recess area (ARA500 and ARA750) for quantitative evaluation of angle width.
Results:
Using manual labeling as the reference standard, the ACA classification network achieved an accuracy of 98.18%, with a sensitivity of 98.74% and a specificity of 97.44% for angle closure. The deep learning system realized automatic measurement of the angle parameters, and the mean differences between automatic and manual measurements were generally small. The coefficients of variation of TIA500, TIA750, AOD500, AOD750, ARA500, and ARA750 measured by the deep learning system were 5.77%, 4.67%, 10.76%, 7.71%, 16.77%, and 12.70%, respectively. The within-subject standard deviations of TIA500, TIA750, AOD500, AOD750, ARA500, and ARA750 were 5.77 degrees, 4.56 degrees, 155.92 µm, 147.51 µm, 0.10 mm2, and 0.12 mm2, respectively. The intraclass correlation coefficients of all the angle parameters were greater than 0.935.
Conclusions:
The deep learning system can evaluate the ACA effectively and accurately based on fully automated analysis of UBM images.
Translational Relevance:
The present work suggests that the deep learning system described here can automatically detect angle closure and quantitatively measure angle parameters from UBM images, enhancing the intelligent diagnosis and management of primary angle-closure glaucoma (PACG).
UBM images were collected from patients who underwent UBM examinations at the Tianjin Medical University Eye Hospital from May 2014 to February 2021. The UBM device was an MD-300L (MEDA Co. Ltd., Tianjin, China) with a 50-MHz ultrasonic probe, a scan depth of 5.5 mm, and a scan width of 8.25 mm. The examination requires the patient to recline so that a water bath can be placed on the ocular surface to immerse the probe. Images were excluded due to anterior chamber angle (ACA) structural abnormalities caused by iridodialysis, motion artifacts, or incompleteness. A total of 3788 UBM images from 1483 patients were selected consecutively from the database, and each image contained only one ACA. All UBM images were de-identified before being obtained by the researchers. This study was conducted following the principles of the World Medical Association Declaration of Helsinki and was approved by the Ethics Committee of Tianjin Medical University Eye Hospital (2019KY-24). Because the study was retrospective and used de-identified UBM images, the requirement for informed consent was waived.
The labeling process for the UBM images comprised two steps. (1) Ophthalmologists classified each UBM image as angle closure or open angle; if the trabecular meshwork touched the iris, the image was defined as angle closure. Labeling an image as angle closure did not require identification of the scleral spur, since the boundary between the corneoscleral tissue and the iris is blurred in closed angles. Figure 1 shows representative images of an open angle and angle closure. (2) For open-angle images, ophthalmologists used LabelMe (Massachusetts Institute of Technology, Cambridge, MA, USA) to mark the scleral spur coordinates and angle recess coordinates and to segment the ACA tissue. Figure 2 shows the labeling process.
The training of deep learning systems requires robust reference standards. Two ophthalmologists (each with more than 8 years of clinical experience) classified all images as angle closure or open angle. If their results agreed, that result was accepted as final. Otherwise, a senior ophthalmologist with more than 15 years of clinical experience made the final decision. For open-angle images, the two ophthalmologists independently marked the scleral spur and the angle recess; the average of the marked coordinates was used as the reference standard, and the senior ophthalmologist checked and corrected it. Likewise, the two ophthalmologists marked the ACA tissue in the open-angle images, and the senior ophthalmologist checked and corrected the annotations.
Using manual labeling as the reference standard, the performance of the classification model was assessed by accuracy, sensitivity, and specificity. The performance of the localization model was assessed by calculating the Euclidean distance between the model-predicted coordinates and the labeled coordinates. The performance of the segmentation model was assessed by pixel accuracy (PA; the proportion of correctly segmented pixels among all pixels) and mean intersection over union (mIOU; the intersection of the predicted and manually annotated ACA tissue divided by their union).
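As a minimal illustration (not the authors' code), the following Python sketch shows how these metrics can be computed from binary labels, predicted coordinates, and binary ACA-tissue masks; the variable names and the pixel-to-micrometer scale factor are assumptions.

```python
import numpy as np

def classification_metrics(y_true, y_pred):
    """Accuracy, sensitivity, and specificity with angle closure as the
    positive class (assumed labels: 1 = angle closure, 0 = open angle)."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tp = np.sum((y_pred == 1) & (y_true == 1))
    tn = np.sum((y_pred == 0) & (y_true == 0))
    fp = np.sum((y_pred == 1) & (y_true == 0))
    fn = np.sum((y_pred == 0) & (y_true == 1))
    return {"accuracy": (tp + tn) / y_true.size,
            "sensitivity": tp / (tp + fn),
            "specificity": tn / (tn + fp)}

def localization_error(pred_xy, true_xy, um_per_pixel=1.0):
    # Euclidean distance between predicted and labeled coordinates,
    # converted from pixels to micrometers (scale factor is an assumption).
    return np.linalg.norm(np.asarray(pred_xy) - np.asarray(true_xy)) * um_per_pixel

def pa_iou(pred_mask, true_mask):
    """Pixel accuracy and IoU for one binary ACA-tissue mask; the paper's
    mIOU averages IoU over images/classes."""
    pred_mask, true_mask = pred_mask.astype(bool), true_mask.astype(bool)
    pa = np.mean(pred_mask == true_mask)
    iou = (pred_mask & true_mask).sum() / (pred_mask | true_mask).sum()
    return pa, iou
```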
To assess the consistency of the measurements between the ophthalmologists and the deep learning system, we calculated the interobserver reproducibility (within-subject standard deviation), coefficient of variation (CV; within-subject standard deviation divided by the overall mean), intraclass correlation coefficient (ICC), and limits of agreement based on the angle parameters measured by the ophthalmologists and automatically measured by the deep learning system. In the assessment, P values less than or equal to 0.05 were considered significant.
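The agreement statistics can be sketched as follows, assuming paired manual and automatic measurements per image. The paper does not state which ICC model was used, so a one-way random-effects ICC(1,1) computed from the ANOVA mean squares is shown for illustration.

```python
import numpy as np

def agreement_stats(manual, auto):
    """Within-subject SD (Sw), CV, and ICC for paired measurements
    (manual vs. automatic); an illustrative sketch, not the study's code."""
    x = np.column_stack([manual, auto]).astype(float)  # shape (n images, 2)
    n, k = x.shape
    # Within-subject SD: square root of the mean within-image variance.
    sw = np.sqrt(np.mean(np.var(x, axis=1, ddof=1)))
    # CV: within-subject SD divided by the overall mean, as defined above.
    cv = sw / np.mean(x)
    # One-way random-effects ICC(1,1) from between/within mean squares.
    grand = x.mean()
    ms_between = k * np.sum((x.mean(axis=1) - grand) ** 2) / (n - 1)
    ms_within = np.sum((x - x.mean(axis=1, keepdims=True)) ** 2) / (n * (k - 1))
    icc = (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)
    return sw, cv, icc
```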
In total, 185 images were excluded due to ACA structural abnormalities caused by iridodialysis (65 images), motion artifacts (18 images), or incompleteness (102 images). The final data set contained 3788 UBM images (2146 open-angle and 1642 angle-closure images) from 1483 patients. The training, validation, and testing sets were split randomly at the patient level so that images from a single patient never appeared in both the testing set and the training/validation sets (training set/validation set/testing set = 6:2:2). This separation is essential to prevent data leakage. For the ACA classification task, 2267 images (1285 open-angle and 982 angle-closure images) were assigned to the training set, 760 images (434 open-angle and 326 angle-closure images) to the validation set, and 761 images (427 open-angle and 334 angle-closure images) to the testing set. Using the manual classification as the reference standard, the classification accuracy reached 98.18%, and the sensitivity and specificity for angle closure reached 98.74% and 97.44%, respectively.
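A patient-level split of this kind can be implemented, for example, with scikit-learn's GroupShuffleSplit, as in the sketch below; the authors' actual splitting code is not described, and because the split is by patient, the image-level proportions are only approximately 6:2:2.

```python
from sklearn.model_selection import GroupShuffleSplit

def patient_level_split(image_ids, patient_ids, seed=0):
    """6:2:2 split in which all images from one patient stay together,
    preventing leakage across training/validation/testing sets."""
    # Carve off ~20% of patients for the testing set.
    outer = GroupShuffleSplit(n_splits=1, test_size=0.2, random_state=seed)
    trainval_idx, test_idx = next(outer.split(image_ids, groups=patient_ids))
    # Take 25% of the remaining patients (~20% overall) for validation.
    inner = GroupShuffleSplit(n_splits=1, test_size=0.25, random_state=seed)
    tv_groups = [patient_ids[i] for i in trainval_idx]
    tr, va = next(inner.split(trainval_idx, groups=tv_groups))
    return ([trainval_idx[i] for i in tr],
            [trainval_idx[i] for i in va],
            list(test_idx))
```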
For the scleral spur and angle recess localization task and the ACA tissue segmentation task, 1285 open-angle images were assigned to the training set, 434 to the validation set, and 427 to the testing set. Using the coordinates marked by the ophthalmologists as the reference standard, the mean Euclidean distance of the scleral spur localization model was 65.19 ± 51.47 µm. The Euclidean distance distribution of the model was 5.62% within 10 µm, 50.82% within 50 µm, 80.80% within 100 µm, and 92.74% within 150 µm.
Figure 4 shows representative images of various Euclidean distances between the scleral spur locations marked by ophthalmologists and those predicted by the deep learning model. Similarly, the mean Euclidean distance of the angle recess localization model was 43.32 ± 41.23 µm. The Euclidean distance distribution of the model was 9.13% within 10 µm, 74.00% within 50 µm, 94.38% within 100 µm, and 97.19% within 150 µm. There were no statistically significant differences in the localization error distributions of the scleral spur and angle recess at different angle widths (Mann–Whitney U test, P > 0.05), and no association was found between angle width and localization error. Using the manual segmentation as the reference standard, the PA and mIOU of the deep learning segmentation model reached 98.94% and 97.11%, respectively.
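For illustration, a comparison of this kind can be run with SciPy's Mann–Whitney U test; the error arrays below are hypothetical placeholders, not study data.

```python
from scipy.stats import mannwhitneyu

# Per-image localization errors (µm) grouped by angle width;
# values are illustrative only.
errors_narrow = [42.1, 55.3, 61.0, 38.7]
errors_wide = [47.9, 50.2, 66.4, 35.1]
stat, p = mannwhitneyu(errors_narrow, errors_wide, alternative="two-sided")
print(f"U = {stat:.1f}, P = {p:.3f}")  # P > 0.05: no significant difference
```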
In this study, we developed and assessed a deep learning system composed of multilevel CNNs for automatic assessment of the ACA. The results suggest that the system can automatically classify UBM images into angle closure (iridotrabecular contact) and open angle with high accuracy (98.18%). The system's automatic measurements of angle parameters such as TIA, AOD, and ARA in open-angle images are in good agreement with manual measurements. We believe that this automatic ACA assessment system will facilitate the development of intelligent diagnostic systems for primary angle-closure glaucoma (PACG) and enhance the application of UBM imaging in clinical care and scientific research on PACG.
Several studies have reported automated ACA assessment. UBM Pro 2000 (Paradigm Medical Industries, Salt Lake City, UT, USA) is a program for ACA quantization based on UBM images.30 However, this program requires the user to recognize the ACA and adjust the image contrast, which may increase the measurement differences between observers. The Zhongshan Angle Assessment Program proposed a quantitative assessment method for the ACA based on anterior segment optical coherence tomography (AS-OCT) images,31 but this method requires the operator to determine the location of the scleral spur. Lin et al.32 developed software for measuring angle parameters and iris parameters based on UBM images, but this method requires the operator to locate the scleral spur and other anatomic reference points. In these studies, users must manually identify specific anatomic structures as reference points for automatic ACA assessment. These semiautomated methods introduce user subjectivity: manual identification of anatomic structures depends on the clinical experience of the user, and different operators may use different criteria to locate the reference points. Even under the same localization criteria, some localization error will remain due to limitations of image resolution and contrast. Image analysis based on computer vision and deep learning may be an effective solution for automatic quantitative assessment of the ACA in these instances. We realized the automatic localization of the scleral spur and angle recess and the automatic segmentation of the ACA tissue based on deep learning algorithms. The deep learning-based automatic ACA assessment system we propose is fully automatic, without any manual intervention, and for the same input image it always outputs the same angle parameters. Although the deep learning system eliminates interobserver and intraobserver errors in measuring a single UBM image, factors such as the experience of the UBM operator and the scanning position of the ultrasonic probe on the eyeball will still affect the reproducibility of the angle parameter measurements. Because the objective of this study was to design a deep learning system for automatic assessment of the ACA in UBM images, clinical information about whether patients had undergone prior surgery or laser treatment and whether they had secondary angle closure was not recorded.
The ICC values between manual measurement and automatic measurement of TIA, AOD, and ARA were all greater than 0.935, and the ICC value of TIA was greater than 0.985. The CV values of AOD500, AOD750, and ARA750 were 10.76%, 7.71%, and 12.70%, respectively. The CV values of TIA500 and TIA750 were 5.67% and 4.67%, respectively, and the reproducibility of TIA500 and TIA750 was 5.77 degrees and 4.56 degrees, respectively. For comparison, the CV values of AOD500, AOD750, and ARA750 achieved by the Zhongshan Angle Assessment Program were 17.6%, 12.8%, and 14.9%, respectively.31 Lin et al.32 described automatic measurement of angle parameters using UBM images and reported that the ICC ranges of TIA, AOD, and ARA were 0.60 to 0.92, 0.52 to 0.89, and 0.64 to 0.92, respectively. Li et al.33 only realized the automatic prediction of TIA, with an ICC of 0.95, a CV of 6.8%, and a reproducibility of 6.1 degrees. Compared to the abovementioned systems, our deep learning system achieved better consistency with the manual measurement results.
By analyzing the angle parameters, we found that the consistency of TIA measured by the deep learning system was better than that of AOD and ARA. Accurate measurement of angle parameters relies on precise localization of the scleral spur. Table 2 summarizes the relationship between the angle parameter measurement errors and the scleral spur localization error. Compared with the measurement errors of TIA and AOD, the ARA measurement error had the most significant correlation with the scleral spur localization error (R2 = 0.446, P < 0.0001), which may be because the ARA measurement is greatly affected by the scleral spur localization error but is insensitive to irregular iris anterior surfaces. The correlation between the TIA measurement error and the scleral spur localization error was the weakest because the TIA measurement is affected by the scleral spur position, the irregular anterior iris surface, and the angle recess position. The relationship between angle parameter measurement error and scleral spur localization error largely explains the better consistency of TIA compared with AOD and ARA. The within-subject standard deviations for AOD500 and AOD750 were relatively large, which may be due to changes in iris morphology in the angle region.
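To make the geometric dependence concrete, the sketch below shows one conventional way to derive AOD and TIA from the scleral spur, the angle recess, and segmented boundary points; it illustrates the definitions, not the authors' implementation, and the nearest-point approximation of the iris surface is an assumption. ARA would additionally integrate the area bounded by the cornea, the iris, and the AOD line, which is why a spur localization error propagates so directly into ARA.

```python
import numpy as np

def aod_tia(spur, recess, cornea_pts, iris_pts, dist_um=500.0):
    """AOD and TIA at dist_um anterior to the scleral spur. Inputs are
    (x, y) coordinates in micrometers; cornea_pts and iris_pts are
    boundary points from the segmentation, shape (n, 2)."""
    # Point on the corneal endothelium closest to dist_um from the spur.
    d_cornea = np.linalg.norm(cornea_pts - spur, axis=1)
    p_cornea = cornea_pts[np.argmin(np.abs(d_cornea - dist_um))]
    # AOD approximated as the distance from that corneal point to the
    # nearest point on the anterior iris surface.
    d_iris = np.linalg.norm(iris_pts - p_cornea, axis=1)
    p_iris = iris_pts[np.argmin(d_iris)]
    aod = d_iris.min()
    # TIA: angle at the angle recess between rays through the two endpoints.
    v1, v2 = p_cornea - recess, p_iris - recess
    cos_t = np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2))
    tia = np.degrees(np.arccos(np.clip(cos_t, -1.0, 1.0)))
    return aod, tia
```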
Our proposed deep learning system can automatically detect angle closure and quantitatively measure angle width in UBM images. Studies have shown that some patients with PACG are unaware of their condition before disease onset, which may result in critical visual damage.34 Therefore, using UBM imaging together with the deep learning system can help screen people at high risk for PACG and enable interventions that protect against vision loss. The deep learning system can also dynamically measure a patient's angle parameters and monitor their trends over time. Angle parameters automatically measured before and after treatment can assist ophthalmologists in evaluating the treatment effect. Because people in underdeveloped areas have limited access to UBM experts with rich clinical experience, the system could also be combined with telemedicine to help decide who should be referred for further evaluation and treatment.
Our study also has some limitations. The first is the lack of absolute ground truth in UBM image annotation. Because the labeling process is subjective, human errors may be introduced, and when the deep learning model learns from the ophthalmologists' labels, it inevitably learns those errors as well. Some studies have pointed out that as the number of annotating experts increases, the objectivity of the annotation results also increases, and the errors become concentrated around zero.35 Therefore, adding labeling experts may be an effective way to mitigate the absence of absolute ground truth. The second limitation is that all UBM images in the data set come from the same UBM device. Different UBM devices may produce different image sizes and resolutions, which would affect assessment of the ACA, so the deep learning system proposed in this study does not directly apply to other UBM devices. A third limitation is that all UBM images in the data set are from Chinese patients, so our results may not generalize to other ethnic groups. Although our data set contains many UBM images from real-world clinical settings, the generality of our findings should be treated with caution given the lack of validation on external data sets. Finally, a recent study has shown that poor-quality images often have a negative influence on image-based artificial intelligence systems.36 Therefore, future work should address the automatic recognition of poor-quality images to ensure that the performance of the deep learning system is not degraded by them.
In summary, our proposed deep learning-based automatic ACA assessment system achieves reliable and repeatable angle-closure detection and automatic measurement of angle parameters. In the future, more studies are needed to evaluate the clinical performance of this system and to compare it with clinical assessments made without artificial intelligence.
Supported by CAMS Initiative for Innovative Medicine (2017-12M-3-020) and Key Technologies R&D program of Tianjin (19YFZCSY00510). The funding organization had no role in the design or conduct of this research.
Disclosure: W. Wang, None; L. Wang, None; X. Wang, None; S. Zhou, None; S. Lin, None; J. Yang, None