November 2021
Volume 10, Issue 13
Open Access
Predicting 10-2 Visual Field From Optical Coherence Tomography in Glaucoma Using Deep Learning Corrected With 24-2/30-2 Visual Field
Author Affiliations & Notes
  • Yohei Hashimoto
    Department of Ophthalmology, The University of Tokyo, Tokyo, Japan
  • Taichi Kiwaki
    Graduate School of Information Science and Technology, The University of Tokyo, Tokyo, Japan
  • Hiroki Sugiura
    Graduate School of Information Science and Technology, The University of Tokyo, Tokyo, Japan
  • Shotaro Asano
    Department of Ophthalmology, The University of Tokyo, Tokyo, Japan
  • Hiroshi Murata
    Department of Ophthalmology, The University of Tokyo, Tokyo, Japan
    Department of Ophthalmology, National Center for Global Health and Medicine, Tokyo, Japan
  • Yuri Fujino
    Department of Ophthalmology, The University of Tokyo, Tokyo, Japan
    Department of Ophthalmology, Shimane University Faculty of Medicine, Izumo, Japan
  • Masato Matsuura
    Department of Ophthalmology, The University of Tokyo, Tokyo, Japan
  • Atsuya Miki
    Department of Ophthalmology, Osaka University Graduate School of Medicine, Osaka, Japan
  • Kazuhiko Mori
    Department of Ophthalmology, Kyoto Prefectural University of Medicine, Kyoto, Japan
  • Yoko Ikeda
    Department of Ophthalmology, Kyoto Prefectural University of Medicine, Kyoto, Japan
    Oike-Ganka Ikeda Clinic, Kyoto, Japan
  • Takashi Kanamoto
    Hiroshima Memorial Hospital, Hiroshima, Japan
  • Junkichi Yamagami
    Department of Ophthalmology, JR General Hospital, Tokyo, Japan
  • Kenji Inoue
    Inouye Eye Hospital, Tokyo, Japan
  • Masaki Tanito
    Department of Ophthalmology, Shimane University Faculty of Medicine, Shimane, Japan
  • Kenji Yamanishi
    Graduate School of Information Science and Technology, The University of Tokyo, Tokyo, Japan
  • Ryo Asaoka
    Department of Ophthalmology, The University of Tokyo, Tokyo, Japan
    Department of Ophthalmology, Seirei Hamamatsu General Hospital, Shizuoka, Japan
    Seirei Christopher University, Shizuoka, Japan
    Nanovision Research Division, Research Institute of Electronics, Shizuoka University, Shizuoka, Japan
    The Graduate School for the Creation of New Photonics Industries, Shizuoka, Japan
  • Correspondence: Ryo Asaoka, Department of Ophthalmology, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-8655, Japan. e-mail: ryoasa0120@mac.com 
Translational Vision Science & Technology November 2021, Vol.10, 28. doi:https://doi.org/10.1167/tvst.10.13.28
Abstract

Purpose: To investigate whether a correction based on a Humphrey field analyzer (HFA) 24-2/30-2 visual field (VF) can improve the prediction performance of a deep learning model to predict the HFA 10-2 VF test from macular optical coherence tomography (OCT) measurements.

Methods: This is a multicenter, cross-sectional study. The training dataset comprised 493 eyes of 285 subjects (407, open-angle glaucoma [OAG]; 86, normative) who underwent HFA 10-2 testing and macular OCT. The independent testing dataset comprised 104 OAG eyes of 82 subjects who had undergone the HFA 10-2 test, the HFA 24-2/30-2 test, and macular OCT. A convolutional neural network (CNN) deep learning (DL) model was trained to predict threshold sensitivity (TH) values in HFA 10-2 from retinal thickness measured by macular OCT. The predicted TH values were modified by pattern-based regularization (PBR) and corrected with HFA 24-2/30-2. Absolute error (AE) of mean TH values and mean absolute error (MAE) of TH values were compared between the CNN-PBR alone model and the CNN-PBR corrected with HFA 24-2/30-2.

Results: AE of mean TH values was lower in the CNN-PBR with HFA 24-2/30-2 correction than in the CNN-PBR alone (1.9 dB vs. 2.6 dB; P = 0.006). MAE of TH values was lower in the CNN-PBR with correction than in the CNN-PBR alone (4.2 dB vs. 5.3 dB; P < 0.001). The inferior temporal quadrant showed lower prediction errors than the other quadrants.

Conclusions: The performance of a DL model to predict 10-2 VF from macular OCT was improved by the correction with HFA 24-2/30-2.

Translational Relevance: This model can reduce the burden of additional HFA 10-2 testing by making the best use of routinely performed HFA 24-2/30-2 testing and macular OCT.

Introduction
Glaucoma is characterized by progressive visual field (VF) damage and is the leading cause of irreversible blindness in the world.1 Glaucomatous VF deterioration is accompanied by structural changes such as ganglion cell death and loss of axons.2–4 Structural damage can be detected by optical coherence tomography (OCT),5 and researchers have developed models to discriminate between glaucomatous and healthy OCT-imaged eyes using machine learning algorithms such as support vector machine (SVM),6 random forests,7 and deep learning (DL).8,9 Recent research has also demonstrated the potential for DL models to predict VF sensitivity from OCT images in patients with glaucoma.10–14
Patients suspected of having glaucoma are almost always tested with static automated perimetry using test points that are spaced 6° apart, such as the Humphrey field analyzer (HFA) 24-2/30-2 test.15,16 This is because glaucomatous VF damage usually starts as a Bjerrum scotoma or nasal step, which often appears within the central 30°.17 Recent studies have revealed, however, that measurement of the central 10° VF is essential for the accurate assessment of glaucomatous damage.18 In particular, vision-related quality of life is more closely associated with the 10° VF than with the 24° or 30° VF. The problem is that it is costly and time consuming to perform a 10° VF test in addition to a 24° or 30° VF test.19 This emphasizes the importance of OCT imaging, because it has the potential to reduce the number of VF measurements necessary to accurately monitor disease progression.
We recently reported two models to predict visual sensitivities of the HFA 10-2 test in a pointwise manner from spectral domain OCT (SD-OCT) in glaucoma patients: (1) a DL model using a convolutional neural network (CNN) with pattern-based regularization (PBR) (CNN-PBR),11 which achieved an absolute error (AE) of 2.7 dB for the whole VF and a pointwise mean absolute error (MAE) of 5.5 dB; and (2) a DL model with correction by HFA 24-2/30-2 test results of the same eye, which improved MAE from between 9.4 and 9.5 dB to 5.4 dB.20 In the current study, we combined these two models and investigated whether it was beneficial to correct CNN-PBR-predicted 10-2 visual sensitivities using HFA 24-2/30-2 test results of the same eye.
Methods
This study was approved by the Research Ethics Committee of the Graduate School of Medicine and Faculty of Medicine at the University of Tokyo, Inouye Eye Hospital, Kyoto Prefectural University of Medicine, Oike-Ganka Ikeda Clinic, JR Tokyo General Hospital, Hiroshima Memorial Hospital, Osaka University Graduate School of Medicine, and Shimane University Faculty of Medicine. Informed consent for storing their data in the hospital database for research purposes was obtained from all patients. This study was performed according to the tenets of the Declaration of Helsinki.
Training Dataset
Generally, a large amount of paired VF and OCT data is required for training, but only a small amount of paired data was available in the current study, because VF testing was performed more often than OCT scanning in our clinical setting. We used both the paired and nonpaired datasets in the CNN-PBR model by first learning patterns from the nonpaired data (VF data alone) using an unsupervised method and then regularizing (i.e., PBR) the CNN prediction with reference to the obtained patterns. The details have been described elsewhere.12
The training dataset comprised 493 eyes of 285 subjects (407 eyes with open-angle glaucoma [OAG] and 86 normative eyes). Subjects had undergone HFA 10-2 VF testing and OCT imaging. A second training dataset included 7715 HFA 10-2 VF tests that were not paired with SD-OCT images; these came from patients with glaucoma who were not included in the paired dataset. All subjects underwent complete ophthalmic examinations, including biomicroscopy, gonioscopy, intraocular pressure measurement, fundoscopy, refraction, best-corrected visual acuity measurement, and axial length measurement. Patients were enrolled between April 2013 and August 2016 at the University of Tokyo Hospital, Inouye Eye Hospital, JR Tokyo General Hospital, and Hiroshima Memorial Hospital.
OAG was defined as follows: (1) presence of typical glaucomatous changes such as a disc rim notch and a retinal nerve fiber layer defect identified by ophthalmoscopy or fundus photography; (2) gonioscopically wide open angles of grade 3 or 4 based on the Shaffer classification; (3) visual acuity ≥ 0.5 LogMAR; (4) refractive error < +3.0 diopters; and (5) age 20 to 80 years. Patients with ocular diseases that could affect the results of SD-OCT examinations and VF testing, such as diabetic retinopathy or age-related macular degeneration, were carefully excluded. Eyes with clinically significant senile cataract were also excluded.
Normative eyes were defined as follows: (1) no abnormal findings except for clinically insignificant senile cataract; (2) no history of ocular diseases that could affect the results of SD-OCT and VF examinations; (3) normal VF test results according to the Anderson-Patella criteria; (4) refractive error < +3.0 diopters; and (5) age 20 to 80 years.
Testing Dataset
The testing dataset was independent of the training dataset. It comprised 104 OAG eyes of 82 subjects. Inclusion and exclusion criteria and VF and SD-OCT measurements were identical to those in the training dataset. In addition to HFA 10-2 test and OCT data, all eyes in the testing dataset had also undergone HFA 24-2/30-2 VF testing. The HFA 24-2/30-2 test data were used to correct the predicted threshold sensitivity (TH) values using CNN-PBR. 
VF Testing
VF testing was performed with the HFA 10-2 test using the Swedish Interactive Thresholding Algorithm standard strategy within three months of the SD-OCT measurement. Near-refractive correction was used as necessary. All subjects had previously experienced the HFA test at least once. We excluded unreliable VFs with fixation loss ≥ 20% or false-positive responses ≥ 15%, following the recommendation of the manufacturer. HFA 24-2/30-2 tests in the testing dataset were performed within three months of the HFA 10-2 test.
SD-OCT Measurement
RS 3000 (Nidek Co Ltd, Aichi, Japan) and OA-2000 (Tomey, Aichi, Japan) were used to obtain SD-OCT and axial length measurement data, respectively. All SD-OCT measurements were performed after pupil dilation with 1% tropicamide. We excluded data with apparent eye movement, involuntary blinking, or saccades during the measurement, and imaging data with quality factor < 7, as recommended by the manufacturer. A 9.0 × 9.0 mm image was centered on the fovea. The thicknesses of three retinal layers were exported as pixel images (512 × 128): macular retinal nerve fiber layer (RNFL), ganglion cell layer + inner plexiform layer, and outer segment + retinal pigment epithelium. We resized each of these images to 224 × 224 pixels with bicubic interpolation21 over a 4 × 4 neighborhood and resampling using pixel area relation22 so that the parameters of ResNet, one of the most popular pretrained models for image classification,23 could be inherited. Furthermore, data augmentation was performed via vertical flip. The full details are described in our previous study.12
Deep Learning (CNN-PBR)
We trained the CNN-PBR model to predict threshold (TH) values of the HFA 10-2 test from the three retinal thicknesses, using the parameters of ResNet. The details of CNN-PBR are described in our previous reports.11,12 In short, compared with a CNN without PBR, CNN-PBR makes better use of the limited paired (VF with OCT) data, because the patterns learned from VF data alone help to avoid overfitting.
Correction With HFA 24-2/30-2 Test Results
To try to improve prediction performance, we further corrected the predicted TH values by using HFA 24-2/30-2 VF test results of the same eye, as detailed in Figure 1. First, the TH values of the 10-2 test at coordinates (X,Y): (3, 9), (9, 3), (3, −9), (9, −3), (−3, −9), (−9, −3), (−9, 3), and (−3, 9) were predicted using neighboring TH values, which were predicted with CNN-PBR, and the weights based on the distance to the test points. The predicted TH values of the HFA 10-2 test were averaged in each quadrant (superior nasal, inferior nasal, superior temporal, and inferior temporal). Second, the actual TH values of the HFA 24-2 test at (±3, ±3), (±3, ±9), (±9, ±3) were averaged in each quadrant. Finally, the difference between these averages was added to the predicted TH values of the 10-2 test in each quadrant. 
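The correction described above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function names (`idw_estimate`, `quadrant_offset`, `apply_correction`) are hypothetical, the numbers are those of the representative eye in the Figure 1 caption, and the pairing of each neighboring point with its predicted TH value is assumed from the order in which the caption lists them.

```python
import math

def idw_estimate(target, neighbors):
    """Inverse-distance-weighted mean of TH values at neighboring points.

    target: (x, y) in degrees; neighbors: list of ((x, y), th_db) pairs.
    """
    weights = [1.0 / math.hypot(x - target[0], y - target[1])
               for (x, y), _ in neighbors]
    weighted_sum = sum(w * th for w, (_, th) in zip(weights, neighbors))
    return weighted_sum / sum(weights)

def quadrant_offset(predicted_10_2, actual_24_2):
    """Mean actual 24-2 TH minus mean predicted 10-2 TH at the shared locations."""
    return (sum(actual_24_2) / len(actual_24_2)
            - sum(predicted_10_2) / len(predicted_10_2))

def apply_correction(quadrant_predictions, offset_db):
    """Shift every predicted 10-2 TH value in the quadrant by the same offset."""
    return [th + offset_db for th in quadrant_predictions]

# CNN-PBR predictions around (3, 9); the point-to-value pairing is an
# assumption based on the order of points and terms in the Figure 1 caption.
neighbors_3_9 = [((1, 9), 30.5), ((1, 7), 31.5), ((3, 7), 32.4), ((5, 7), 32.1)]
th_3_9 = idw_estimate((3, 9), neighbors_3_9)

# Remaining values quoted in the caption for the superior nasal quadrant.
th_9_3, th_3_3 = 30.6, 31.2
offset = quadrant_offset([th_3_9, th_9_3, th_3_3], [30, 29, 32])

print(round(th_3_9, 1), round(offset, 1))  # 31.6 -0.8
```

Because a single offset is applied per quadrant, the correction shifts the overall level of the predicted sensitivities in that quadrant without changing their spatial pattern.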
Figure 1.

Outline of the HFA 24-2/30-2 correction method. The superior nasal VF in the left eye of a representative subject is shown. First, the TH values at coordinates (3,9) and (9,3) were predicted from the neighboring TH values, which were predicted with CNN-PBR, using weights based on the distance to the test points. For example, the distances from (1,9), (1,7), (3,7), and (5,7) to (3,9) are \(2\), \(2\sqrt{2}\), \(2\), and \(2\sqrt{2}\), and thus the weights become \(\frac{1}{2} : \frac{1}{2\sqrt{2}} : \frac{1}{2} : \frac{1}{2\sqrt{2}} = \sqrt{2} : 1 : \sqrt{2} : 1\). Using these weights, the TH value of 10-2 at (3,9) was predicted as follows:

\(\frac{30.5 \times \sqrt{2} + 31.5 \times 1 + 32.4 \times \sqrt{2} + 32.1 \times 1}{\sqrt{2} + 1 + \sqrt{2} + 1} = 31.6\)

Likewise, the predicted TH value at (9,3) was 30.6. We averaged the predicted TH values at (3,9), (9,3), and (3,3):

\(\frac{31.6 + 30.6 + 31.2}{3} = 31.1\)

Second, we averaged the TH values of 24-2 at (3,9), (9,3), and (3,3):

\(\frac{30 + 29 + 32}{3} = 30.3\)

Finally, the difference between these values (30.3 − 31.1 = −0.8) was added to the predicted TH values of 10-2. The corrections in the other quadrants were performed in a similar manner.
Statistical Analysis
We compared the prediction performance between the two models: (1) CNN-PBR alone (CNN-PBRalone) and (2) CNN-PBR with correction (CNN-PBRcorrection). First, we compared absolute error (AE) of mean TH (mTH) values for the whole VF between the two models, using a linear mixed effects model whereby patients were regarded as the random effects. Second, we compared pointwise prediction performance through mean absolute error (MAE) between the two models, using a linear mixed effects model. MAE was calculated as:  
\[\mathrm{MAE} = \frac{1}{68}\sum_{i=1}^{68} \left| \text{predicted visual sensitivity of the } i\text{th point} - \text{actual visual sensitivity of the } i\text{th point} \right|,\]
where i indexes the 68 predicted test points.
We illustrated MAE at each testing point to investigate whether the prediction error depended on the spatial position of the test point. Additionally, we illustrated the signed prediction error stratified by actual sensitivity of the HFA 10-2 test using a boxplot. Statistical analysis was performed with Python (version 3.7.6; Python Software Foundation) and the statistical programming language R (version 3.6.3; R Foundation for Statistical Computing, Vienna, Austria).
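The two error metrics compared in this study differ in where the absolute value is taken, which a short Python sketch may make concrete. The function names and the four-point example are hypothetical (a real HFA 10-2 field has 68 points); only the definitions follow the text above.

```python
def mean_absolute_error(predicted, actual):
    """Pointwise MAE: the mean of |predicted - actual| over all test points of one eye."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)

def absolute_error_of_mean(predicted, actual):
    """AE of mean TH: |mean(predicted) - mean(actual)| for one eye."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    return abs(sum(predicted) / len(predicted) - sum(actual) / len(actual))

# Hypothetical four-point field in dB, for illustration only.
predicted = [30.0, 28.0, 25.0, 20.0]
actual = [31.0, 27.0, 22.0, 24.0]

mae = mean_absolute_error(predicted, actual)    # (1 + 1 + 3 + 4) / 4 = 2.25
ae = absolute_error_of_mean(predicted, actual)  # |25.75 - 26.0| = 0.25
print(mae, ae)
```

In the AE of mean TH, over- and underestimates at different points cancel before the absolute value is taken, so it can never exceed the pointwise MAE for the same eye. This is consistent with the AE values reported here being smaller than the MAE values.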
Results
The Table shows the characteristics of the training and testing datasets. Actual TH values of the HFA 10-2 test are shown in Figure 2. The mTH value in the inferior temporal quadrant (26.3 dB) was higher than in the superior temporal (20.3 dB), superior nasal (17.8 dB), and inferior nasal quadrants (22.2 dB) (linear mixed model, all P < 0.001). 
Table.
 
Characteristics of the Training and Testing Datasets
Figure 2.
 
Actual threshold values of the 10-2 VF test (left eye). Mean (upper row) and standard deviation (lower row) values of all eyes at each test point are shown. Right eyes were mirror imaged.
A significantly lower AE of mTH values was observed with CNN-PBRcorrection than with CNN-PBRalone (1.9 dB vs. 2.6 dB; difference, −0.7; 95% confidence interval, −1.3 to −0.2; linear mixed model, P = 0.006) (Fig. 3). MAE of TH values was also significantly lower with CNN-PBRcorrection than with CNN-PBRalone (4.2 dB vs. 5.3 dB; difference, −1.1; 95% confidence interval, −1.6 to −0.6; linear mixed model, P < 0.001) (Fig. 4). The AE at each testing point is shown in Figure 5. In general, the AEs in the inferior temporal quadrant tended to be lower than those in the other quadrants. Figure 6 shows the signed prediction error stratified by actual sensitivity of HFA 10-2. There was a trend toward more negative prediction error where actual sensitivity was high.
Figure 3.
 
Absolute error of mean threshold values. The absolute error of CNN-PBR corrected with Humphrey field analyzer 24-2/30-2 test results was significantly lower than the same model without correction (1.9 dB vs. 2.6 dB; difference, −0.7; 95% confidence interval, −1.3 to −0.2; linear mixed model, P = 0.006).
Figure 4.
 
Mean absolute error of threshold values. The MAE of CNN-PBR corrected with Humphrey field analyzer 24-2/30-2 test results was significantly lower than the same model without correction (4.2 dB vs. 5.3 dB; difference, −1.1; 95% confidence interval, −1.6 to −0.6; linear mixed model, P < 0.001).
Figure 5.
 
Pointwise absolute prediction error (left eye). Mean (upper row) and standard deviation (lower row) values of all eyes at each predicted 10-2 VF test point are shown. Right eyes were mirror imaged.
Figure 6.
 
Pointwise signed prediction error (left eye). Mean (upper row) and standard deviation (lower row) values of all eyes at each predicted 10-2 VF test point are shown. Right eyes were mirror imaged.
Discussion
In the present study, the HFA 10-2 test was predicted from SD-OCT imaging using a DL model (CNN-PBR) further corrected using HFA 24-2/30-2 test results of the same eye. Prediction performance was significantly improved using this correction method. Prediction errors were small: 1.9 dB (AE of mTH values) and 4.2 dB (MAE of TH values); we believe these to be the smallest errors reported to date for this type of prediction model. 
Many models have been reported to discriminate between glaucoma eyes and nonglaucoma eyes using OCT.6,7,24,25 A limited number of models have also been developed to predict VF measurements from OCT, but the predicted measurements were usually mean VF sensitivity or sectoral VF sensitivity.10,13 In this study, we predicted TH values in a pointwise manner; as shown in a recent paper, accurate pointwise prediction is more difficult than sectoral prediction.11 The importance of pointwise prediction cannot be overstated when considering the application of such a model to real-world clinical settings. Given that test-retest VF variability lies between 1 and 2 dB in the central area and between 4 and 6 dB in the points at 27°,26,27 the current performance of pointwise prediction (MAE, 4.2 dB) within the central 10° area should be considered very good.
As is widely acknowledged, the association between retinal structure and function is nonlinear.4,28 Structural damage precedes functional damage in glaucoma; that is, visual sensitivity does not deteriorate until RNFL thickness reaches a critical level.29 DL models are helpful for the current prediction task, in which HFA 10-2 test sensitivity (response variable) and SD-OCT measurements (explanatory variables) are considered to be nonlinearly associated, because DL does not require the assumption of a linear relation between response and explanatory variables that conventional multivariable models generally make.
In the current study, the AE in the inferior temporal area tended to be lower than the AE observed in other quadrants. This area corresponds to the preserved “central isle” of the VF seen in patients with advanced glaucoma,30 and the TH values observed in this region remained relatively high. This is a possible explanation for the better prediction performance in the inferior temporal area, because OCT is more useful for predicting VF in early-to-moderate glaucoma than in advanced glaucoma.29,31 Another possible reason may be the smaller standard deviation of actual TH values in the inferior temporal area. In addition, conventional VF testing, including the Swedish Interactive Thresholding Algorithm standard, determines VF sensitivity using the bracketing method, which is relatively inaccurate compared with thresholding via frequency-of-seeing curves. This inaccuracy is much more pronounced where VF sensitivity is low, which implies that there may be a limit to the prediction performance where VF sensitivity is very low.
There are several limitations in the current study. First, predicted VF sensitivity was confined to the 10-2 test pattern rather than 24-2/30-2 test patterns because of the limited macular area captured by SD-OCT. Future wide-field OCT may solve the problem. Second, the model developed in the current study is not ready to be used directly in the clinical setting, because it has not been implemented in any medical support tools. However, it may be possible to integrate the current model into software in the future. Third, the generalizability of the current findings may be limited. We used the same patients as in the previous study11 because the principal aim of the current study was to compare the prediction performance between the previous model (using only OCT data) and the current model (using both OCT and HFA 24-2/30-2 data). Future studies using other large external datasets are needed to validate the current findings. Fourth, there was a possibility that myopic eyes could bias the current results. We therefore repeated the analyses after dividing the eyes in the testing dataset into those with axial length of <26.5 mm and >26.5 mm.32 The resulting mean MAE was 4.1 dB and 4.5 dB, respectively, which were almost the same as the value of 4.2 dB calculated in the overall eyes. Therefore we consider that bias caused by myopic eyes would not be very large. Fifth, similar to the fourth limitation, the existence of normal-tension glaucoma (NTG) could bias the current results. The proportions of NTG and primary open-angle glaucoma were 29% (N = 30) and 14% (N = 15), respectively, in the testing dataset. The fine classification (primary open-angle glaucoma or NTG) was unknown in the remaining eyes, because these patients were already under treatment when referred to our hospitals.
The mean MAE was 4.47 and 4.52 dB in eyes with NTG and primary open-angle glaucoma, respectively; these values were close to that in the overall eyes (4.2 dB). However, future studies with larger sample sizes are needed to shed light on this issue, because VF damage in NTG has been reported to differ from that in primary open-angle glaucoma. Sixth, by our definition, HFA 24-2/30-2 could have been done more than three months (maximum of six months) apart from the OCT measurement. However, all of the examined eyes had an interval of ≤3 months from OCT to HFA 24-2/30-2, except for one eye, so this would have only a negligible effect on the obtained results.33
In conclusion, the current DL model (CNN-PBR), with correction based on 24-2/30-2 test results, demonstrated better prediction performance than the CNN-PBR model alone. Software/support tools equipped with this methodology, which we need to develop in the future, would be beneficial in the clinical setting. 
Acknowledgments
Supported by the Ministry of Education, Culture, Sports, Science and Technology of Japan grants 18KK0253, 19H01114, 17K11418, 25861618, 20768254, and 00768351; the Translational Research program; Strategic Promotion for practical application of Innovative medical Technology (TR-SPRINT) from Japan Agency for Medical Research and Development (AMED); JST-AIP JPMJCR19U4.
Disclosure: Y. Hashimoto, None; T. Kiwaki, None; H. Sugiura, None; S. Asano, None; H. Murata, None; Y. Fujino, None; M. Matsuura, None; A. Miki, None; K. Mori, None; Y. Ikeda, None; T. Kanamoto, None; J. Yamagami, None; K. Inoue, None; M. Tanito, None; K. Yamanishi, None; R. Asaoka, None 
References
Jonas JB, Aung T, Bourne RR, et al. Glaucoma. Lancet. 2017; 390: 2183–2193. [CrossRef] [PubMed]
Harwerth RS, Carter-Dawson L, Smith EL, et al. Neural Losses Correlated with Visual Losses in Clinical Perimetry. Invest Opthalmol Vis Sci. 2004; 45: 3152. [CrossRef]
Kerrigan LA, Quigley HA, Pease ME, et al. Number of ganglion cells in glaucoma eyes compared with threshold visual field tests in the same persons. Invest Opthalmol Vis Sci. 2000; 41: 8.
Leite MT, Zangwill LM, Weinreb RN, et al. Structure-function relationships using the cirrus spectral domain optical coherence tomograph and standard automated perimetry. J Glaucoma. 2012; 21: 49–54. [CrossRef] [PubMed]
Hood DC. Improving our understanding, and detection, of glaucomatous damage: an approach based upon optical coherence tomography (OCT). Prog Retin Eye Res. 2017; 57: 46–75. [CrossRef] [PubMed]
Burgansky-Eliash Z, Wollstein G, Chu T, et al. Optical coherence tomography machine learning classifiers for glaucoma detection: a preliminary study. Invest Opthalmol Vis Sci. 2005; 46: 4147. [CrossRef]
Asaoka R, Hirasawa K, Iwase A, et al. Validating the usefulness of the “random forests” classifier to diagnose early glaucoma with optical coherence tomography. Am J Ophthalmol. 2017; 174: 95–103. [CrossRef] [PubMed]
Phene S, Dunn RC, Hammel N, et al. Deep learning and glaucoma specialists: The relative importance of optic disc features to predict glaucoma referral in fundus photographs. Ophthalmology. 2019; 126: 1627–1639. [CrossRef] [PubMed]
Asaoka R, Murata H, Hirasawa K, et al. Using deep learning and transfer learning to accurately diagnose early-onset glaucoma from macular optical coherence tomography images. Am J Ophthalmol. 2019; 198: 136–145. [CrossRef] [PubMed]
Park K, Kim J, Kim S, Shin J. Prediction of visual field from swept-source optical coherence tomography using deep learning algorithms. Graefes Arch Clin Exp Ophthalmol. 2020; 258: 2489–2499. [CrossRef] [PubMed]
Hashimoto Y, Asaoka R, Kiwaki T, et al. Deep learning model to predict visual field in central 10° from optical coherence tomography measurement in glaucoma. Br J Ophthalmol. 2020; 105: 507–513. [CrossRef] [PubMed]
Sugiura H, Kiwaki T, Yousefi S, et al. Estimating Glaucomatous Visual Sensitivity from Retinal Thickness with Pattern-Based Regularization and Visualization. In: Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining - KDD ’18. London: ACM Press; 2018: 783–792.
Christopher M, Bowd C, Belghith A, et al. Deep learning approaches predict glaucomatous visual field damage from optical coherence tomography optic nerve head enface images and retinal nerve fiber layer thickness maps. Ophthalmology. 2020; 127: 346–356. [CrossRef] [PubMed]
Hemelings R, Elen B, Breda JB, et al. Pointwise visual field estimation from optical coherence tomography in glaucoma: a structure-function analysis using deep learning. ArXiv210603793 Cs Eess. 2021, http://arxiv.org/abs/2106.03793; Accessed September 12, 2021.
Hood DC, Raza AS, de Moraes CGV, et al. Glaucomatous damage of the macula. Prog Retin Eye Res. 2013; 32: 1–21. [CrossRef] [PubMed]
Khoury JM, Donahue SP, Lavin PJ, Tsai JC. Comparison of 24-2 and 30-2 perimetry in glaucomatous and nonglaucomatous optic neuropathies. J Neuroophthalmol. 1999; 19: 100–108. [CrossRef] [PubMed]
Drance SM. The glaucomatous visual field. Br J Ophthalmol. 1972; 56: 186–200. [CrossRef] [PubMed]
Rao HL, Babu JG, Addepalli UK, et al. Retinal nerve fiber layer and macular inner retina measurements by spectral domain optical coherence tomograph in Indian eyes with early glaucoma. Eye. 2012; 26: 133–139. [CrossRef] [PubMed]
Crabb DP, Russell RA, Malik R, et al. Frequency of visual field testing when monitoring patients newly diagnosed with glaucoma: mixed methods and modelling. Health Serv Deliv Res. 2014; 2: 1–102. [CrossRef]
Asano S, Asaoka R, Murata H, et al. Predicting the central 10 degrees visual field in glaucoma by applying a deep learning algorithm to optical coherence tomography images. Sci Rep. 2021; 11: 2214. [CrossRef] [PubMed]
Das V. A novel diagnostic information based framework for super-resolution of retinal fundus images. Comput Med Imaging Graph. 2019; 72: 22–33. [CrossRef] [PubMed]
Thévenaz P, Blu T, Unser M. Image interpolation and resampling. Handbook of medical imaging, processing and analysis. 2000; 1: 393–420.
He K, Zhang X, Ren S, Sun J. Deep Residual Learning for Image Recognition. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Las Vegas: IEEE; 2016: 770–778.
Mwanza J-C, Warren JL, Budenz DL. Combining spectral domain optical coherence tomography structural parameters for the diagnosis of glaucoma with early visual field loss. Invest Ophthalmol Vis Sci. 2013; 54: 8393. [CrossRef]
Baskaran M, Ong E-L, Li J-L, et al. Classification algorithms enhance the discrimination of glaucoma from normal eyes using high-definition optical coherence tomography. Invest Ophthalmol Vis Sci. 2012; 53: 2314. [CrossRef]
Parrish RK II, Schiffman J, Anderson DR. Static and kinetic visual field testing: reproducibility in normal volunteers. Arch Ophthalmol. 1984; 102: 1497–1502. [CrossRef] [PubMed]
Heijl A, Lindgren G, Olsson J. Normal variability of static perimetric threshold values across the central visual field. Arch Ophthalmol. 1987; 105: 1544–1549. [CrossRef] [PubMed]
Altangerel U, Spaeth GL, Rhee DJ. Visual function, disability, and psychological impact of glaucoma. Curr Opin Ophthalmol. 2003; 14: 100–105. [CrossRef] [PubMed]
Hood DC, Kardon RH. A framework for comparing structural and functional measures of glaucomatous damage. Prog Retin Eye Res. 2007; 26: 688–710. [CrossRef] [PubMed]
Weber J, Schultze T, Ulrich H. The visual field in advanced glaucoma. Int Ophthalmol. 1989; 13: 47–50. [CrossRef] [PubMed]
Swanson WH, Felius J, Pan F. Perimetric defects and ganglion cell damage: Interpreting linear relations using a two-stage neural model. Invest Ophthalmol Vis Sci. 2004; 45: 466. [CrossRef]
Ohno-Matsui K, Lai TYY, Lai C-C, Cheung CMG. Updates of pathologic myopia. Prog Retin Eye Res. 2016; 52: 156–187. [CrossRef] [PubMed]
Thonginnetra O, Greenstein VC, Chu D, et al. Normal versus high tension glaucoma: a comparison of functional and structural defects. J Glaucoma. 2010; 19: 151–157. [CrossRef] [PubMed]
Figure 1.
 
Outline of the HFA 24-2/30-2 correction method. The superior nasal VF in the left eye of a representative subject is shown. First, the TH values at the (3,9) and (9,3) coordinates were estimated from the neighboring TH values, which were predicted with CNN-PBR, using weights based on the distance to each neighboring test point. For example, the distances from (1,9), (1,7), (3,7), and (5,7) to (3,9) are \(2, 2\sqrt{2}, 2, 2\sqrt{2}\), and thus the weights become \(\frac{1}{2} : \frac{1}{2\sqrt{2}} : \frac{1}{2} : \frac{1}{2\sqrt{2}} = \sqrt{2} : 1 : \sqrt{2} : 1\). Using these weights, the TH value of 10-2 at (3,9) was predicted as follows:

\(\frac{30.5 \times \sqrt{2} + 31.5 \times 1 + 32.4 \times \sqrt{2} + 32.1 \times 1}{\sqrt{2} + 1 + \sqrt{2} + 1} = 31.6\)

Likewise, the predicted TH value at (9,3) was 30.6. We averaged the predicted TH values at (3,9), (9,3), and (3,3):

\(\frac{31.6 + 30.6 + 31.2}{3} = 31.1\)

Second, we averaged the TH values of 24-2 at (3,9), (9,3), and (9,9):

\(\frac{30 + 29 + 32}{3} = 30.3\)

Finally, the difference between these values (30.3 − 31.1 = −0.8) was added to the predicted TH values of 10-2. The corrections in the other quadrants were performed in a similar manner.
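The arithmetic in the Figure 1 caption can be reproduced with a short sketch. This is only an illustration of the worked example, not the authors' implementation: the function names `idw_estimate` and `apply_offset` are hypothetical, the threshold values are the ones quoted in the caption, and applying one offset to every predicted point in the quadrant is an assumption based on the caption's wording.

```python
import math

def idw_estimate(target, neighbors):
    """Inverse-distance-weighted mean of neighboring thresholds (dB).

    neighbors maps (x, y) coordinates to predicted threshold values.
    """
    weights = {p: 1.0 / math.dist(p, target) for p in neighbors}
    total = sum(weights.values())
    return sum(weights[p] * th for p, th in neighbors.items()) / total

def apply_offset(predicted, offset):
    """Add the same offset to every predicted threshold (assumed per quadrant)."""
    return {p: th + offset for p, th in predicted.items()}

# CNN-PBR predictions at the four neighbors of (3,9), taken from the caption.
neighbors_3_9 = {(1, 9): 30.5, (1, 7): 31.5, (3, 7): 32.4, (5, 7): 32.1}
th_3_9 = idw_estimate((3, 9), neighbors_3_9)

# Values quoted in the caption: estimate at (9,3) and prediction at (3,3).
th_9_3 = 30.6
th_3_3 = 31.2
mean_predicted = (th_3_9 + th_9_3 + th_3_3) / 3

# Measured 24-2 thresholds at the three reference points listed in the caption.
mean_measured = (30 + 29 + 32) / 3

# The offset moves the predicted 10-2 thresholds toward the measured 24-2 level.
offset = mean_measured - mean_predicted
corrected = apply_offset(neighbors_3_9, offset)

print(round(th_3_9, 1), round(offset, 1))
```

Because the weights are normalized by their sum, using 1/2 and 1/(2√2) directly gives the same result as the simplified ratio √2 : 1 : √2 : 1 in the caption.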
Figure 2.
 
Actual threshold values of the 10-2 VF test (left eye). Mean (upper row) and standard deviation (lower row) values of all eyes at each test point are shown. Right eyes were mirror imaged.
Figure 3.
 
Absolute error of mean threshold values. The absolute error of CNN-PBR corrected with Humphrey field analyzer 24-2/30-2 test results was significantly lower than the same model without correction (1.9 dB vs. 2.6 dB; difference, −0.7; 95% confidence interval, −1.3 to −0.2; linear mixed model, P = 0.006).
Figure 4.
 
Mean absolute error of threshold values. The MAE of CNN-PBR corrected with Humphrey field analyzer 24-2/30-2 test results was significantly lower than the same model without correction (4.2 dB vs. 5.3 dB; difference, −1.1; 95% confidence interval, −1.6 to −0.6; linear mixed model, P < 0.001).
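Figures 3 and 4 report two different summaries of prediction error. The sketch below shows how they differ, assuming the usual definitions: the absolute error of the mean threshold compares the per-eye averages, while the MAE averages the pointwise absolute differences. The function names and the four-point toy values are hypothetical and are not data from the study.

```python
def abs_error_of_mean(predicted, actual):
    """|mean(predicted) - mean(actual)| for one eye, in dB."""
    return abs(sum(predicted) / len(predicted) - sum(actual) / len(actual))

def mean_abs_error(predicted, actual):
    """Mean of pointwise |predicted - actual| for one eye, in dB."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)

# Toy thresholds (dB) chosen so that pointwise errors cancel in the mean.
predicted = [30.0, 28.0, 26.0, 24.0]
actual = [28.0, 30.0, 24.0, 26.0]

err_of_mean = abs_error_of_mean(predicted, actual)  # 0.0
mae = mean_abs_error(predicted, actual)             # 2.0
print(err_of_mean, mae)
```

Over- and underestimates at different test points can cancel in the mean threshold, which is consistent with the MAE values in Figure 4 being larger than the absolute errors of the mean in Figure 3.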
Figure 5.
 
Pointwise absolute prediction error (left eye). Mean (upper row) and standard deviation (lower row) values of all eyes at each predicted 10-2 VF test point are shown. Right eyes were mirror imaged.
Figure 6.
 
Pointwise signed prediction error (left eye). Mean (upper row) and standard deviation (lower row) values of all eyes at each predicted 10-2 VF test point are shown. Right eyes were mirror imaged.
Table.
 
Characteristics of the Training and Testing Datasets