May 2024
Volume 13, Issue 5
Open Access
Artificial Intelligence  |   May 2024
Keratoconus Progression Determined at the First Visit: A Deep Learning Approach With Fusion of Imaging and Numerical Clinical Data
Author Affiliations & Notes
  • Lennart M. Hartmann
    Department of Ophthalmology, University Hospital Ulm, Ulm, Germany
  • Denna S. Langhans
    Department of Ophthalmology, University Hospital Ulm, Ulm, Germany
  • Veronika Eggarter
    Department of Ophthalmology, University Hospital Ulm, Ulm, Germany
  • Tim J. Freisenich
    Department of Ophthalmology, University Hospital Ulm, Ulm, Germany
  • Anna Hillenmayer
    Department of Ophthalmology, University Hospital Ulm, Ulm, Germany
  • Susanna F. König
    Department of Ophthalmology, University Hospital Ulm, Ulm, Germany
  • Efstathios Vounotrypidis
    Department of Ophthalmology, University Hospital Ulm, Ulm, Germany
  • Armin Wolf
    Department of Ophthalmology, University Hospital Ulm, Ulm, Germany
  • Christian M. Wertheimer
    Department of Ophthalmology, University Hospital Ulm, Ulm, Germany
  • Correspondence: Lennart M. Hartmann, Department of Ophthalmology, Ulm University, Prittwitzstrasse 43, Ulm 89075, Germany. e-mail: lennart.hartmann@uniklinik-ulm.de 
Translational Vision Science & Technology May 2024, Vol.13, 7. doi:https://doi.org/10.1167/tvst.13.5.7
Abstract

Purpose: Multiple clinical visits are necessary to determine progression of keratoconus before offering corneal cross-linking. The purpose of this study was to develop a neural network that can potentially predict progression during the initial visit using tomography images and other clinical risk factors.

Methods: The neural network's development depended on data from 570 keratoconus eyes. During the initial visit, numerical risk factors and posterior elevation maps from Scheimpflug imaging were collected. Increase of steepest keratometry of 1 diopter during follow-up was used as the progression criterion. The data were partitioned into training, validation, and test sets. The first two were used for training, and the latter for performance statistics. The impact of individual risk factors and images was assessed using ablation studies and class activation maps.

Results: The most accurate prediction of progression during the initial visit was obtained by using a combination of MobileNet and a multilayer perceptron with an accuracy of 0.83. Using numerical risk factors alone resulted in an accuracy of 0.82. The use of only images had an accuracy of 0.77. The most influential risk factors in the ablation study were age and posterior elevation. The greatest activation in the class activation maps was seen at the highest posterior elevation where there was significant deviation from the best fit sphere.

Conclusions: The neural network has exhibited good performance in predicting potential future progression during the initial visit.

Translational Relevance: The developed neural network could be of clinical significance for keratoconus patients by identifying individuals at risk of progression.

Introduction
Keratoconus is a primary corneal ectasia,1 manifesting as visual decline caused by thinning at the center of the cornea, along with increasing corneal curvature and higher-order aberrations. Keratoconus is a chronic condition that typically exhibits periods of progression followed by stabilization, depending on various factors.2 The progression of keratoconus is usually defined as an increase in the dioptric power of the anterior corneal optic interface.3 Corneal collagen cross-linking is a proven treatment for patients with progression with a suitable risk–benefit balance.4 
Progression confirmation is necessary prior to cross-linking and requires regular reevaluation of patients.5 However, surgical intervention is only required for a small fraction of patients, because most clinical check-ups do not indicate progression. Developing a predictive method to identify patients at risk of keratoconus progression during their initial presentation could minimize resource use and lead to earlier intervention for at-risk individuals. 
Several risk factors, such as age, comorbidities, and corneal curvature, have been identified as influencing the risk of progression.6 Most of those factors are determined by corneal curvature maps using corneal tomography over several consecutive visits. It can be challenging to determine progression from this complex combination of data, especially after only one visit, which may lead to errors. 
Deep learning in artificial neural networks has piqued interest owing to its ability to discover underlying patterns in large nonlinear datasets.7,8 Computer vision-based convolutional neural networks have been used to detect variations in images of keratoconus patients compared with healthy control eyes.9 Preliminary research has explored progression recognition using only imaging.10,11 
Using solely images to assess the progression of keratoconus through machine learning may yield a less precise outcome when compared with using images in conjunction with other risk factors. For instance, it is widely recognized that younger age is strongly linked to keratoconus progression.10 The combination of multiple data types in a single neural network necessitates a more complex network architecture, which we attempted to test in this study to predict a future increase in steepest keratometry (Kmax) at the patient's initial clinical examination. In addition, a careful ablation study and class activation maps were used to determine the impact of individual risk factors and images on the overall result. 
Methods
Study Design
This retrospective study included 570 keratoconic eyes. All patients were seen as part of standard routine visits at the Keratoconus Clinic of the Department of Ophthalmology at University Hospital Ulm from January 2016 to October 2022. This study was approved by the ethics committee of Ulm University (ethical approval ID: 332/22). 
Inclusion Criteria and Follow-up
Patients were referred to University Eye Hospital Ulm for keratoconus diagnosis. The cornea specialist diagnosed all patients with keratoconus after a complete ophthalmic exam and corneal tomography using Scheimpflug imaging (Pentacam, Oculus, Wetzlar, Germany). Follow-up appointments were advised every 3 to 12 months subsequent to individual risk assessment for disease progression. As a standard practice, patients were informed to discontinue use of contact lenses 2 weeks before each appointment. 
Exclusion Criteria
All patients with corneal ectasia, except for those with keratoconus, were excluded from the study. Additionally, patients who had undergone previous surgical intervention for ocular comorbidities such as cross-linking, cataract surgery, keratoplasty, and refractive laser surgery, which may have impacted keratometry, were excluded. 
Progression Definition
There is currently no consistent or clear definition of keratoconus progression. In this study, progression was defined as an increase of more than 1 diopter (D) in the anterior Kmax during the clinical course; otherwise, the eye was considered to have nonprogressive keratoconus. Accordingly, the change in Kmax during follow-up was used to label each eye as either progressive (an increase in Kmax meeting the 1 D criterion) or stable. 
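The labeling rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' algorithm: the function name `label_progression`, the use of the largest follow-up value as the reference, and the strict inequality are all assumptions made for this example.

```python
from typing import Sequence


def label_progression(baseline_kmax: float,
                      followup_kmax: Sequence[float],
                      threshold_d: float = 1.0) -> str:
    """Label one eye as 'progressive' or 'stable' from its Kmax history (diopters)."""
    if not followup_kmax:
        raise ValueError("at least one follow-up Kmax value is required")
    # Assumption: the largest follow-up value defines the increase over baseline.
    increase = max(followup_kmax) - baseline_kmax
    # Assumption: strict inequality, following the wording "more than 1 D".
    return "progressive" if increase > threshold_d else "stable"


# Hypothetical Kmax values, chosen only to show both outcomes.
print(label_progression(51.0, [51.4, 52.3]))  # progressive
print(label_progression(51.0, [50.8, 51.9]))  # stable
```

A rule of this kind only produces a first label; in the study, the labels were subsequently rechecked by cornea specialists.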
Data Quality Control
The data were labeled in a binary fashion as progressive or stable. Before inputting the data into the neural network, an automated algorithm calculated the progression from Kmax values. Then, two cornea specialists reevaluated all images and tabular data acquired during the clinical routine to ensure their consistency. The diagnosis and progression were then reconfirmed. Any disagreements were resolved by involving a third specialist, with the final decision made by a two-to-one vote. Of the original 1293 eyes, 570 were deemed eligible for this study after applying stringent exclusion criteria and quality control. 
Image Processing and Data Preparation
An algorithm for batch preprocessing of tomography images was developed by CMW using the PyCharm IDE (2021.3.1, JetBrains, Prague, Czech Republic) with Python 3.9 and the following libraries: pillow, numpy, csv, pandas, and regex. The Pentacam four-map refraction display within a 9-mm radius was exported as a portable network graphics file. The color scale used for the export was an Oculus color map with a relative scale in 2.5 microns and 61 colors. The posterior elevation map was cropped, scaled, and surrounded by black. The following numerical data were recorded: age, Kmax, pachymetry, sex, K1, K2, and maximum posterior elevation. The table was transformed into a comma-separated values file and used as input for the neural network (Fig. 1). 
Figure 1.
 
Image and clinical risk factor data were preprocessed and used as input to the neural network. First, Scheimpflug images and numerical tabular data were processed separately in two networks. Then, they were concatenated and processed in a multilayer perceptron. Finally, performance was evaluated using a test dataset that the network had not seen before.
Architecture of Neural Network
Before training the neural network, the entire dataset, consisting of both Scheimpflug images and clinical data (age, Kmax, etc.), was randomly divided into three subsets: a training set of 80% of the data, a validation set of 10%, and a test set of 10%. Separate validation and test sets were used specifically to avoid poor generalization to unseen data. Our neural network consists of two distinct parts (Fig. 1). Several published convolutional networks were trained on the Scheimpflug images, including AlexNet, Darknet-53, EfficientNet, GoogLeNet, MLPMixer, MobileNet, NIN, ResNet18, SqueezeNet, and Xception. In addition, we used a self-designed, very small network with three convolutional layers. The clinical data, such as age and Kmax, were processed using another type of neural network, a multilayer perceptron. The outputs of this multilayer perceptron and of the convolutional network processing the Scheimpflug images were joined by a concatenation layer and fed into another multilayer perceptron, which then produced the final output. The independent test dataset, which had not yet been encountered by the neural network, was then used to assess the accuracy, specificity, and sensitivity of the network in predicting keratoconus progression. A more detailed explanation of the specific coding can be found in Supplementary File 1.
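To make the data partition and the concatenation step concrete, the following framework-free Python sketch may help. It is not the code from Supplementary File 1: the function names, the fixed random seed, the layer sizes, and all weights are hypothetical, and the two feature vectors stand in for the outputs of the convolutional branch and the tabular multilayer perceptron.

```python
import math
import random
from typing import List, Sequence, Tuple


def split_indices(n: int, seed: int = 0) -> Tuple[List[int], List[int], List[int]]:
    """Randomly split n sample indices into 80% training, 10% validation, 10% test."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)  # seed is an assumption, used for reproducibility
    n_train = n * 8 // 10
    n_val = n // 10
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]


def dense(x: Sequence[float], weights: Sequence[Sequence[float]],
          bias: Sequence[float]) -> List[float]:
    """Fully connected layer: one output value per weight row."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def fused_forward(image_features: Sequence[float], tabular_features: Sequence[float],
                  hidden_w: Sequence[Sequence[float]], hidden_b: Sequence[float],
                  out_w: Sequence[float], out_b: float) -> float:
    """Concatenate both branch outputs, apply a small MLP head, return P(progression)."""
    fused = list(image_features) + list(tabular_features)      # concatenation layer
    hidden = [max(0.0, v) for v in dense(fused, hidden_w, hidden_b)]  # ReLU
    logit = dense(hidden, [out_w], [out_b])[0]
    return 1.0 / (1.0 + math.exp(-logit))                      # sigmoid


train, val, test = split_indices(570)
print(len(train), len(val), len(test))  # 456 57 57

# Hypothetical branch outputs and weights, for illustration only.
p = fused_forward([0.2, 0.7], [0.5],
                  hidden_w=[[0.1, -0.3, 0.8], [0.4, 0.2, -0.5]], hidden_b=[0.0, 0.1],
                  out_w=[1.0, -1.0], out_b=0.0)
print(0.0 < p < 1.0)  # True
```

The point of the sketch is the order of operations: each modality is reduced to a feature vector first, and only the concatenated vector is passed to the final classifier.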
Receiver Operating Characteristic Curve
Receiver operating characteristic curves were generated by extracting the probabilities for each class of the previously unseen test dataset. The values of sensitivity and specificity were plotted using GraphPad Prism 10 (GraphPad Software, San Diego, CA, USA). In this study, receiver operating characteristic curves were generated using images only, clinical risk factors only, and a combination of images and clinical risk factors. 
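As an illustration of how predicted class probabilities translate into the points of a receiver operating characteristic curve, a minimal Python sketch follows. The curves in this study were plotted with GraphPad Prism; the functions `roc_points` and `auc` and the example labels and scores below are hypothetical and serve only to show the calculation.

```python
from typing import List, Sequence, Tuple


def roc_points(y_true: Sequence[int], y_score: Sequence[float]) -> List[Tuple[float, float]]:
    """Return (1 - specificity, sensitivity) pairs, one per decision threshold."""
    positives = sum(y_true)
    negatives = len(y_true) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both classes must be present")
    points = [(0.0, 0.0)]
    # Lower the threshold step by step; every sample scoring at or above it is called positive.
    for threshold in sorted(set(y_score), reverse=True):
        tp = sum(1 for t, s in zip(y_true, y_score) if s >= threshold and t == 1)
        fp = sum(1 for t, s in zip(y_true, y_score) if s >= threshold and t == 0)
        points.append((fp / negatives, tp / positives))
    return points


def auc(points: Sequence[Tuple[float, float]]) -> float:
    """Area under the curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2 for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Hypothetical labels (1 = progressive) and predicted probabilities of progression.
labels = [1, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.1]
curve = roc_points(labels, scores)
print(curve)       # [(0.0, 0.0), (0.0, 0.5), (0.5, 0.5), (0.5, 1.0), (1.0, 1.0)]
print(auc(curve))  # 0.75
```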
Ablation Study of Numerical Data
An ablation study selectively removes single parameters from a neural network, tests its performance without the removed parameter, and compares the resulting performances to determine the importance of each parameter. In this study, we conducted an ablation study of the numerical data by testing the seven clinical risk factors in this way, iterating through all 127 possible combinations without repetition (the 2^7 − 1 nonempty subsets of seven factors) in the multilayer perceptron and adjusting the neurons of the input layer accordingly. To determine the most suitable hyperparameters, a grid search similar to the one mentioned elsewhere in this article was used. The contribution of a feature was assessed by the maximum accuracy achieved when that feature was absent or present. 
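The figure of 127 combinations follows from counting every nonempty subset of the seven risk factors. A short Python sketch, using hypothetical identifier names for the factors, enumerates them:

```python
from itertools import combinations

# The seven numerical risk factors named in the Methods; the spellings are illustrative.
RISK_FACTORS = ["age", "Kmax", "pachymetry", "sex", "K1", "K2", "posterior_elevation"]


def feature_subsets(features):
    """Yield every nonempty subset of the features, smallest subsets first."""
    for size in range(1, len(features) + 1):
        for subset in combinations(features, size):
            yield subset


subsets = list(feature_subsets(RISK_FACTORS))
print(len(subsets))  # 127, i.e. 2**7 - 1
print(subsets[0])    # ('age',)
print(len(subsets[-1]))  # 7, the full feature set
```

In an ablation run, each subset would define the input layer of one multilayer perceptron, which is then trained and tested on those columns only.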
Class Activation Maps for Images
To gain insight into the imaging features that differentiate progression from no progression, we used class activation maps to identify channels with increased activation for a given prediction. Our approach involved using ResNet18 and the test dataset, as well as the weights of the final convolutional layer from the best run with images only. Note that these methods only considered images and did not account for clinical risk factors. 
Statistical Analyses
Excel 365 (Microsoft, Redmond, WA, USA), PyCharm IDE with csv library (2021.3.1, JetBrains, Prague, Czech Republic), and SPSS 29 (IBM, Armonk, NY, USA) were used for data processing and statistical analysis. Statistical comparisons were performed using either a Fisher's exact test or Student's t test. A P value of less than 0.05 was considered significant. In this study, sensitivity is defined as the true positive rate, that is, the number of correctly detected progressive cases divided by all progressive cases. Analogously, specificity is the number of correctly detected nonprogressive cases divided by all nonprogressive cases. 
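These definitions can be written out directly from the four cells of a confusion matrix. The Python sketch below is an illustration only; the function name and the counts are hypothetical and are not taken from the study's test set.

```python
def classification_metrics(tp: int, fn: int, tn: int, fp: int) -> dict:
    """Sensitivity, specificity, and accuracy from confusion-matrix counts.

    tp: progressive eyes predicted progressive
    fn: progressive eyes predicted stable
    tn: stable eyes predicted stable
    fp: stable eyes predicted progressive
    """
    return {
        "sensitivity": tp / (tp + fn),   # correctly detected progressive / all progressive
        "specificity": tn / (tn + fp),   # correctly detected stable / all stable
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
    }


# Hypothetical counts, chosen only to give round numbers.
metrics = classification_metrics(tp=6, fn=4, tn=18, fp=2)
print(metrics)  # sensitivity 0.6, specificity 0.9, accuracy 0.8
```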
Results
Patient Characteristics
We included 570 eyes with keratoconus, of which 161 showed progression. Among patients without progression, 303 (74%) were male, compared with 124 (77%) of the patients with progression (P = 0.02). The mean age of patients with progression was 27 ± 11 years compared with 32 ± 13 years for patients without progression (P < 0.001). The mean Kmax at first presentation was 54 ± 6 D in progressive cases compared with 51 ± 7 D in nonprogressive cases (P < 0.001). In patients with keratoconus progression, the mean increase of Kmax was 3.3 ± 4.8 D, compared with −0.1 ± 1.1 D in patients without keratoconus progression (P < 0.0001). The mean follow-up was 21 ± 17 months for nonprogressive cases and 23 ± 19 months for progressive cases. The severity of the keratoconus was evaluated using the modified Amsler–Krumeich classification. Among the nonprogressive keratoconus group, 321 patients (78%) were classified as having stage 1 disease compared with 117 patients (73%) in the progressive group (P = 0.14). Stage 2 disease was observed in 66 nonprogressive cases (16%) and 30 progressive cases (19%) (P = 0.47). Stage 3 disease was observed in 12 nonprogressive cases (3%) and 7 progressive cases (4%) (P = 0.40). There were 10 patients (2%) with nonprogressive keratoconus at stage 4, compared with 7 cases (4%) with progressive disease (P = 0.23). 
Performance in Predicting Keratoconus Progression
With the goal of predicting keratoconus progression at the first visit, in terms of future Kmax progression, we developed a neural network that showed promising accuracy using image and clinical data alone or in combination. We modified the convolutional network for deep learning from images and kept the network for numerical tabular data constant. The MobileNet convolutional network in combination with the multilayer perceptron yielded the highest accuracy of 0.83 after 5000 epochs at a learning rate of 0.001 and with a batch size of 8. Additionally, this strategy achieved an acceptable sensitivity of 0.53 and a high specificity of 0.95 (Table and Fig. 2). Although only about one-half of all progressive cases could be correctly detected, 95% of all nonprogressive cases were identified. Overall, an accuracy of 0.83 means that approximately four out of five patients in our dataset could be correctly assigned to progressive or nonprogressive disease. This performance level was comparable with the maximum accuracy of 0.82 achieved with tabular data alone, which was obtained at a learning rate of 0.001, a batch size of 32, and after 3000 epochs. The tabular data also showed a sensitivity of 0.40 and a specificity of 1.0. This result was achieved only when age, pachymetry, K1, K2, and maximum posterior elevation were used for training and testing, while sex and Kmax were excluded. The use of images alone resulted in a reduced accuracy of 0.77 and specificity of 0.93, along with a sensitivity of 0.41, in a small three-layer convolutional network that we constructed ourselves (Table). 
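The relationship between the reported accuracy, sensitivity, and specificity can be checked with a short calculation, because accuracy is the prevalence-weighted average of the two rates. The sketch below assumes that the share of progressive eyes in the test set equals that of the whole cohort (161 of 570); the article does not state the test-set composition, so this is an assumption, and the function name is hypothetical.

```python
def expected_accuracy(sensitivity: float, specificity: float, prevalence: float) -> float:
    """Accuracy as the prevalence-weighted average of sensitivity and specificity."""
    return sensitivity * prevalence + specificity * (1.0 - prevalence)


# Assumption: the test set has the same share of progressive eyes as the whole cohort.
prevalence = 161 / 570
combined = expected_accuracy(0.53, 0.95, prevalence)
print(round(prevalence, 2))  # 0.28
print(round(combined, 2))    # 0.83
```

Under this assumption, the combined model's sensitivity of 0.53 and specificity of 0.95 are consistent with the reported accuracy of 0.83. The calculation also shows why accuracy can remain high with modest sensitivity: about 72% of the eyes are nonprogressive, so specificity dominates the weighted average.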
Table.
 
A Summary of all Neural Network Designs Is Provided, Including the Maximum Performance Measurements Achieved
Figure 2.
 
Receiver operating characteristic (ROC) curves for our neural network are displayed. The ROC curves shown represent the performance achieved with images only (red), clinical risk factors only (blue), and images and clinical risk factors combined (green).
Ablation Study
In our study, the clinical risk factors (0.82) achieved a maximum accuracy similar to that of the combined use of risk factors and images (0.83). In contrast, images alone did not produce the same level of accuracy (0.77). After this finding, we further investigated the critical role of risk factors. We trained and tested a multilayer perceptron using all 127 possible combinations of the seven clinical risk factors in an ablation study. The mean and median accuracy of the runs that had age, Kmax, and posterior elevation included were higher than the mean accuracy of all runs. Conversely, the mean accuracy was lower when these factors were excluded (Fig. 3). The results indicate that a single factor is inadequate for the network to predict progression. However, when two (age and posterior elevation) are combined, they achieve an accuracy of 0.79. The inclusion of Kmax and other factors can further boost accuracy. Remarkably, excluding sex results in the two highest accuracies (Fig. 4). 
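The included-versus-excluded comparison can be sketched as a simple grouping of ablation runs. In the Python example below, the function name, the data structure, and all accuracy values are hypothetical and use only two factors for brevity; they are not results from this study.

```python
from statistics import mean
from typing import Dict, FrozenSet, Tuple


def inclusion_effect(results: Dict[FrozenSet[str], float], factor: str) -> Tuple[float, float]:
    """Mean accuracy of runs that included a factor versus runs that excluded it."""
    included = [acc for combo, acc in results.items() if factor in combo]
    excluded = [acc for combo, acc in results.items() if factor not in combo]
    return mean(included), mean(excluded)


# Made-up maximum accuracies for the three nonempty subsets of two factors.
toy_results = {
    frozenset({"age"}): 0.70,
    frozenset({"sex"}): 0.68,
    frozenset({"age", "sex"}): 0.72,
}
with_age, without_age = inclusion_effect(toy_results, "age")
print(round(with_age, 2), round(without_age, 2))  # 0.71 0.68
```

With seven factors, each factor appears in 64 of the 127 combinations and is absent from the remaining 63, so the two groups being compared are of nearly equal size.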
Figure 3.
 
Mean (blue cross) and median (red line) accuracies for all 127 possible combinations of the seven clinical risk factors in the ablation study are presented. The first box plot represents all possible combinations, while columns two through eight display the mean and median accuracies for combinations that either included (A) or excluded (B) a given factor. Results show that the highest mean accuracy was achieved in combinations that included age, Kmax, and posterior elevation. Omitting any one of these three factors from the combination led to lower accuracy as compared with the mean and median accuracy of all other groups.
Figure 4.
 
The contribution of the clinical risk factors is presented in a matrix plot. A multilayer perceptron was trained and tested in the ablation study, using all 127 possible combinations of the seven clinical risk factors. The y-axis displays the various combinations with each row representing one combination, while the x-axis displays the different risk factors with each column representing one (age = 1, Kmax = 2, pachymetry = 3, sex = 4, K1 = 5, K2 = 6, posterior elevation = 7). If a feature is excluded from a particular combination, it is displayed in black. The results for the maximum accuracy are color coded for each row, as shown on the left. It is evident that relying on a single factor is insufficient for the network to predict progression. However, combining only age and posterior elevation achieves a 0.79 accuracy. The addition of Kmax and other factors can significantly improve accuracy. It is noteworthy that excluding sex yields the two highest accuracies.
Class Activation Maps
To identify discriminative regions in the posterior elevation maps of the Pentacam responsible for discriminating between the stable and progression classes, we used gradient weighted class activation maps. Within this method, the influence of recognized patterns on the predicted outcome is calculated and applied to the input image, providing a visual explanation of those regions pivotal for the behavior of the network. The highest signals were observed at the highest posterior elevation in both groups, but slightly enlarged toward the superior region. It is possible that the deviation from the best fit sphere of the irregular keratoconic cornea in this particular superior area causes a significant alteration in color in the Pentacam images, indicating more advanced keratoconus (see Fig. 5). 
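The core arithmetic of a gradient-weighted class activation map can be sketched without a deep learning framework. The example below follows the commonly described Grad-CAM recipe: average the gradients per channel, form a weighted sum of the feature maps, and apply a ReLU. It is not the authors' implementation; the function name is hypothetical, the activations and gradients are made-up 2 × 2 arrays, and the upsampling of the map to the input image size is omitted.

```python
from typing import List

Matrix = List[List[float]]


def grad_cam(activations: List[Matrix], gradients: List[Matrix]) -> Matrix:
    """Combine the feature maps of a convolutional layer into one heat map.

    activations[k] is feature map k; gradients[k] holds the gradient of the class
    score with respect to that feature map. Both are assumed to be supplied.
    """
    height, width = len(activations[0]), len(activations[0][0])
    # Channel weight: spatial average of the gradient (global average pooling).
    weights = [sum(map(sum, g)) / (height * width) for g in gradients]
    cam = [[0.0] * width for _ in range(height)]
    for w, fmap in zip(weights, activations):
        for i in range(height):
            for j in range(width):
                cam[i][j] += w * fmap[i][j]
    # ReLU keeps only regions that raise the class score.
    cam = [[max(0.0, v) for v in row] for row in cam]
    peak = max(max(row) for row in cam)
    # Scale to [0, 1] for display when the map is not all zeros.
    return [[v / peak for v in row] for row in cam] if peak > 0 else cam


# Made-up example: channel 0 supports the class, channel 1 opposes it.
acts = [[[1.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 1.0]]]
grads = [[[1.0, 1.0], [1.0, 1.0]], [[-1.0, -1.0], [-1.0, -1.0]]]
heat_map = grad_cam(acts, grads)
print(heat_map)
```

In this toy case only the top-left cell remains active, because the second channel has a negative weight and its contribution is removed by the ReLU.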
Figure 5.
 
Four representative images of posterior elevation maps overlaid with the specific class activation maps. Two of the keratoconus samples were stable (A and B), while two were progressive (C and D). Green, yellow, and red colors indicate channels with higher activations, whereas blue denotes low activation. The results reveal that the convolutional neural network is triggered by the point of highest posterior elevation in both groups, albeit slightly expanded to the superior region. The deviation from the best fit sphere in this area could cause a noticeable change in the color of the elevation maps, suggesting the presence of more advanced keratoconus.
Discussion
In this study, we introduce a novel method for predicting keratoconus progression during the initial clinical examination by using various neural network architectures. The models were trained on a posterior best fit sphere elevation map as well as several potential risk factors. After systematically validating multiple network designs and evaluating hyperparameters for a total of approximately 2 × 10^7 training epochs in 570 eyes, we attained a maximum accuracy of 0.83. Our findings reveal an acceptable sensitivity of 0.53 and a high specificity of 0.95 in a previously unseen test dataset. 
Although artificial intelligence has been researched extensively and proven to exhibit high performance in accurately detecting the presence of keratoconus while being sensitive and specific,9 only a limited number of studies have explored the potential of deep learning techniques in predicting the progression of the disease. In a research paper analyzing 218 eyes, a neural network trained with optical coherence tomography images of the anterior segment demonstrated a 79% accuracy rate in predicting the progression of keratoconus.10 The accuracy rate was improved to 85% when age was factored in using a complicated decision tree. Our study incorporated clinical risk data and images in a fully machine learning–based approach. Another study, which included 274 eyes, achieved 81% accuracy by combining corneal tomography and patient age.12 This study, along with our own, demonstrates that the progression of keratoconus, in terms of the increase of Kmax, can be predicted to some extent from a single outpatient visit. 
Unfortunately, determining which parameter or combination of parameters the neural network relied on most to determine the risk of keratoconus progression is not possible, given the black box nature of neural networks. However, an ablation study was conducted to gain a better understanding. The accuracy of the neural network depends on clinical risk factors, with images contributing minimally to the final result. Using only images resulted in significantly lower accuracy compared with using both clinical risk factors and images, or clinical risk factors alone. Concerning clinical risk factors, the inclusion of age, Kmax, and posterior elevation led to higher accuracies. This finding is in line with existing literature, because age and Kmax are numerical risk factors, with the likelihood of progression being higher in younger patients and in those with a Kmax steeper than 55 D.6 Additionally, the posterior elevation is a more sensitive indicator than the anterior corneal surface because it is not affected by epithelial remodeling, which could potentially mask the extent of the ectatic disease.11 This observation aligns with the class activation maps of the images in ResNet18, which show that channel activation occurs around the highest posterior elevation in both groups. Furthermore, there is slight expansion toward the superior region, where a high deviation from the best fit sphere and a strong change in elevation map colors exist, which may suggest a more advanced keratoconus stage, although other explanations remain possible and have yet to be determined. It is important to consider that the posterior elevation map may be affected by increased light scattering caused by corneal opacities. Incorporating these images, which are affected by measurement noise, could have a negative impact on the performance of our model. To address this issue, it may be beneficial to use keratometry maps derived from optical coherence tomography with a higher wavelength. 
Our findings suggest that a combination of numerical clinical data and images yields benefits. However, clinical risk factors alone may be sufficient for predicting progression. It should be highlighted that using only numerical clinical risk data in a deep learning approach requires significantly fewer computational resources than using large tensors of images. Unfortunately, several risk factors were absent from our model because they were not recorded consistently in our retrospective clinical data, including biological profile, eye rubbing, atopic eczema, genetics, and family history.1 The severity of the keratoconus could also be investigated. Multicollinearity among the factors included in the neural network must be considered and verified. It should be emphasized that our study is still valuable in identifying important predictors of disease development. Future studies will need to determine whether the inclusion of these factors would lead to better risk stratification, which could potentially provide a method avoiding computationally intensive image-based processes. 
Predicting future outcomes through complex multifactorial associations using deep learning remains a widely researched topic given its ability to detect patterns in vast amounts of unstructured data.7 However, there are challenges in using these methods, as evaluated in other fields, that also arise in the task of accurately predicting the progression of keratoconus. Predicting stock prices is a complex issue with numerous contributing factors that can result in unstable and difficult predictions.13 One of the problems is the intricacy of the market, and the fact that not all factors can be quantified precisely.14 This may also hold true for keratoconus because genetic testing, for instance, is not regularly accessible as a risk predictor. Historical or recent data are used to forecast future outcomes, which can be impacted by unpredictable or unknown factors. Furthermore, the interpretation of data may be affected by excessive noise.16 For instance, in cases of keratoconus, a stable cornea may show progression owing to an unpredicted pregnancy in the near future.15 Our research resulted in a success rate of 83% in accurately predicting progression of keratoconus. The accuracy of the prediction is unlikely to reach that of a classical deep learning classification problem, although this number may improve with larger datasets. 
Progression has not been clearly defined yet; thus, the parameters used to determine it lack consistency worldwide.5 Kmax is the most commonly used parameter to detect or document progression, and is regularly used, among other clinical factors, to decide on cross-linking.3 Because it is objective and easy to determine, we have chosen a 1-D increase in Kmax as our indicator for defining progression. However, other factors in conjunction with this indicator should be considered. Kmax has been criticized as an imprecise measure for its intended use, because it only includes a small area and the anterior corneal surface.5 Additionally, progression can happen even with stable anterior surface measurements, particularly in the early stages of keratoconus.17,18 In the future, if other parameters for progression are defined or used, we can relabel our data and train a new neural network. 
In addition to the technical aspects of our work, it is crucial to carefully consider socioethical and other scientific factors before implementing the proposed neural network in a clinical setting. If the progression and recommendation of corneal cross-linking are determined by artificial intelligence, caution should be exercised when deciding on the procedure owing to the potential for rare yet severe surgical complications.19 The scientific problem to be solved is to establish the advantages of the newly proposed method in comparison with the current practice of regular outpatient follow-up. This process involves deciding what accuracy of the neural network is required for this application. Socioethical concerns involve examining whether bias from the software, dataset, developer, physician, and study design collectively impacts both the model's training and outcomes. The lack of transparency regarding the processing of data in the deep neural network and the factors involved in decision-making contributes to this issue.20 In a notable example, the application of deep learning for melanoma detection was found to be more challenging for higher Fitzpatrick skin types because the training database had fewer images of these skin types.21 To ascertain the system's reliability, it is necessary to verify what behavior is considered safe in the domains in which it operates.20 A recent study provided a comprehensive overview of ethical guidelines proposed for the implementation of artificial intelligence-based medical applications.22 Further research is required to address these matters. In addition to medical and scientific considerations, one must also be mindful of the proposed model's various socioethical implications. 
A further limitation of this study is the low case number for a deep learning approach, owing to the relative rarity of the disease. This limitation also restricted us to very small convolutional networks. With the small amount of data, larger neural networks might suffer from vanishing gradients, yet larger networks might also achieve higher accuracy. In this context, our study could serve as a proof of concept for predicting keratoconus progression from a single outpatient visit. A multicenter, prospective design would be an option for future studies to reduce bias and to include more cases. 
In conclusion, the proposed method and our neural network show good results in terms of sensitivity, specificity, and accuracy for predicting a future Kmax increase and demonstrate the general feasibility of the approach. In addition, we show that our systematically developed neural network using both image and numerical data may prove helpful in determining keratoconus progression at the first visit. Future studies are warranted to determine its potential. 
Acknowledgments
The authors received institutional funding for the expenses connected to the study. 
Parts of this study were presented at the 14th EuCornea congress, Barcelona, 2023, where they received the best paper award. 
Disclosure: L.M. Hartmann, None; D.S. Langhans, None; V. Eggarter, None; T.J. Freisenich, None; A. Hillenmayer, None; S.F. König, None; E. Vounotrypidis, None; A. Wolf, research grants and study fees from Allergan, Boehringer Ingelheim, Bayer, Alimera, Novartis, Oertli and Roche and consulting honoraria and travel fees from Alimera, Allergan, Bayer, Boehringer Ingelheim, Novartis, Oertli, Roche, and Zeiss (all of these are not connected to this article); C.M. Wertheimer, None 
References
Santodomingo-Rubido J, Carracedo G, Suzaki A, et al. Keratoconus: an updated review. Cont Lens Anterior Eye. 2022; 45(3): 101559.
Romero-Jiménez M, Santodomingo-Rubido J, Wolffsohn JS. Keratoconus: a review. Cont Lens Anterior Eye. 2010; 33(4): 157–166; quiz 205.
Duncan JK, Belin MW, Borgstrom M. Assessing progression of keratoconus: novel tomographic determinants. Eye Vis (Lond). 2016; 3: 6.
Sorkin N, Varssano D. Corneal collagen crosslinking: a systematic review. Ophthalmologica. 2014; 232(1): 10–27.
Gomes JA, Tan D, Rapuano CJ, et al. Global consensus on keratoconus and ectatic diseases. Cornea. 2015; 34(4): 359–369.
Ferdi AC, Nguyen V, Gore DM, Allan BD, Rozema JJ, Watson SL. Keratoconus natural progression: a systematic review and meta-analysis of 11 529 eyes. Ophthalmology. 2019; 126(7): 935–945.
Stahlschmidt SR, Ulfenborg B, Synnergren J. Multimodal deep learning for biomedical data fusion: a review. Brief Bioinform. 2022; 23(2): bbab569.
Anwar SM, Majid M, Qayyum A, Awais M, Alnowami M, Khan MK. Medical image analysis using convolutional neural networks: a review. J Med Syst. 2018; 42(11): 226.
Kuo BI, Chang WY, Liao TS, et al. Keratoconus screening based on deep learning approach of corneal topography. Transl Vis Sci Technol. 2020; 9(2): 53.
Kamiya K, Ayatsuka Y, Kato Y, et al. Prediction of keratoconus progression using deep learning of anterior segment optical coherence tomography maps. Ann Transl Med. 2021; 9(16): 1287.
Khamar P, Rao K, Wadia K, et al. Advanced epithelial mapping for refractive surgery. Indian J Ophthalmol. 2020; 68(12): 2819–2830.
Kato N, Masumoto H, Tanabe M, et al. Predicting keratoconus progression and need for corneal crosslinking using deep learning. J Clin Med. 2021; 10(4): 844.
Soni P, Tewari Y, Krishnan D. Machine learning approaches in stock price prediction: a systematic review. J Phys: Conf Ser. 2022; 2161: 012065, doi:10.1088/1742-6596/2161/1/012065.
Ferreira FGDC, Gandomi AH, Cardoso RTN. Artificial intelligence applied to stock market trading: a review. IEEE Access. 2021; 9: 30898–30917.
Bilgihan K, Hondur A, Sul S, Ozturk S. Pregnancy-induced progression of keratoconus. Cornea. 2011; 30(9): 991–994.
Shen J, Shafiq MO. Short-term stock market price trend prediction using a comprehensive deep learning system. J Big Data. 2020; 7(1): 66.
Crahay FX, Debellemaniere G, Tobalem S, Ghazal W, Moran S, Gatinel D. Quantitative comparison of corneal surface areas in keratoconus and normal eyes. Sci Rep. 2021; 11(1): 6840.
Tomidokoro A, Oshika T, Amano S, Higaki S, Maeda N, Miyata K. Changes in anterior and posterior corneal curvatures in keratoconus. Ophthalmology. 2000; 107(7): 1328–1332.
Tillmann A, Kampik D, Borrelli M, et al. Acute corneal melt and perforation - a possible complication after riboflavin/UV-A crosslinking (CXL) in keratoconus. Am J Ophthalmol Case Rep. 2022; 28: 101705.
Keskinbora K, Güven F. Artificial intelligence and ophthalmology. Turk J Ophthalmol. 2020; 50(1): 37–43.
Rezk E, Eltorki M, El-Dakhakhni W. Leveraging artificial intelligence to improve the diversity of dermatological skin color pathology: protocol for an algorithm development and validation study. JMIR Res Protoc. 2022; 11(3): e34896.
Crossnohere NL, Elsaid M, Paskett J, Bose-Brill S, Bridges JFP. Guidelines for artificial intelligence in medicine: literature review and content analysis of frameworks. J Med Internet Res. 2022; 24(8): e36823.
Figure 1.
 
Image and clinical risk factor data were preprocessed and used as input to the neural network. First, Scheimpflug images and numerical tabular data were processed separately in two networks. Then, they were concatenated and processed in a multilayer perceptron. Finally, performance was evaluated using a test dataset that the network had not seen before.
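The data flow in this caption (two separate branches, concatenation, then a multilayer perceptron) can be illustrated with a small forward pass. This is only a sketch under stated assumptions: the weights are random and untrained, a single dense layer stands in for the convolutional image branch, and all layer sizes, names (`fused_forward`, `random_layer`, `params`), and the 16-element image feature vector are hypothetical. Only the count of seven clinical risk factors comes from the article.

```python
import math
import random


def dense(x, weights, bias):
    """Fully connected layer: one output value per weight row."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def relu(x):
    return [max(0.0, v) for v in x]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def random_layer(n_in, n_out, rng):
    """Untrained layer with small random weights and zero bias (illustration only)."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def fused_forward(image_features, clinical_features, params):
    """Encode each modality separately, concatenate, then classify with an MLP head."""
    img = relu(dense(image_features, *params["image_branch"]))
    tab = relu(dense(clinical_features, *params["tabular_branch"]))
    fused = img + tab  # list concatenation plays the role of feature concatenation
    hidden = relu(dense(fused, *params["head_hidden"]))
    logit = dense(hidden, *params["head_out"])[0]
    return sigmoid(logit)  # value in (0, 1), read as a progression score


rng = random.Random(0)
params = {
    "image_branch": random_layer(16, 8, rng),    # stand-in for the CNN branch
    "tabular_branch": random_layer(7, 4, rng),   # seven clinical risk factors
    "head_hidden": random_layer(12, 6, rng),     # 8 + 4 fused features
    "head_out": random_layer(6, 1, rng),
}
image_features = [rng.random() for _ in range(16)]
clinical_features = [rng.random() for _ in range(7)]
probability = fused_forward(image_features, clinical_features, params)
```

In practice each branch would be trained end to end in a deep learning framework; the point here is only the order of operations: per-modality encoding first, concatenation second, joint classification last.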
Figure 2.
 
Receiver operating characteristic (ROC) curves for our neural network are displayed. The ROC curves shown represent the performance achieved with images only (red), clinical risk factors only (blue), and images and clinical risk factors combined (green).
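For readers less familiar with how such curves are obtained, the sketch below computes ROC points and the area under the curve from predicted scores and binary labels. It is a generic, dependency-free illustration and not the evaluation code used in the study; the function names are hypothetical, and it assumes that both classes are present in `labels`.

```python
def roc_points(labels, scores):
    """Return (false positive rate, true positive rate) pairs for all thresholds.

    labels: 1 = progressive, 0 = stable; scores: higher means more likely progressive.
    Assumes at least one positive and one negative label.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Consume every sample tied at this score before emitting one point.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))
```

Computing one such curve per input configuration (images only, clinical risk factors only, both combined) gives the three curves compared in the figure.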
Figure 3.
 
Mean (blue cross) and median (red line) accuracies for all 127 possible combinations of the seven clinical risk factors in the ablation study are presented. The first box plot represents all possible combinations, whereas columns two through eight display the mean and median accuracies for combinations that either included (A) or excluded (B) a given factor. Results show that the highest mean accuracy was achieved in combinations that included age, Kmax, and posterior elevation. Omitting any one of these three factors from the combination led to lower accuracy as compared with the mean and median accuracy of all other groups.
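The figure of 127 follows from counting all non-empty subsets of seven factors (2^7 − 1 = 127). A minimal sketch of how such an ablation could be enumerated and grouped is given below. The factor names are taken from the Figure 4 caption; `run_ablation`, `split_by_factor`, and the `evaluate` callback (a user-supplied function that trains and tests a model on one combination and returns its accuracy) are hypothetical names and not part of the study's code.

```python
from itertools import combinations
from statistics import mean

FACTORS = ["age", "Kmax", "pachymetry", "sex", "K1", "K2", "posterior elevation"]


def all_subsets(factors):
    """Every non-empty combination of the given factors."""
    return [combo
            for size in range(1, len(factors) + 1)
            for combo in combinations(factors, size)]


def run_ablation(factors, evaluate):
    """Map each combination to the accuracy returned by the supplied evaluate()."""
    return {combo: evaluate(combo) for combo in all_subsets(factors)}


def split_by_factor(results, factor):
    """Mean accuracy of combinations that include vs. exclude one factor."""
    included = [acc for combo, acc in results.items() if factor in combo]
    excluded = [acc for combo, acc in results.items() if factor not in combo]
    return mean(included), mean(excluded)


subsets = all_subsets(FACTORS)  # 2**7 - 1 = 127 combinations
```

Each factor appears in 64 of the 127 combinations and is absent from the remaining 63, which is why the included (A) and excluded (B) groups in the figure are of nearly equal size.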
Figure 4.
 
The contribution of the clinical risk factors is presented in a matrix plot. A multilayer perceptron was trained and tested in the ablation study, using all 127 possible combinations of the seven clinical risk factors. The y-axis displays the various combinations, with each row representing one combination, whereas the x-axis displays the different risk factors, with each column representing one factor (age = 1, Kmax = 2, pachymetry = 3, sex = 4, K1 = 5, K2 = 6, posterior elevation = 7). If a feature is excluded from a particular combination, it is displayed in black. The results for the maximum accuracy are color coded for each row, as shown on the left. It is evident that relying on a single factor is insufficient for the network to predict progression. However, combining only age and posterior elevation achieves an accuracy of 0.79. The addition of Kmax and other factors can significantly improve accuracy. It is noteworthy that excluding sex yields the two highest accuracies.
Figure 5.
 
Four representative images of posterior elevation maps overlaid with the specific class activation maps. Two of the keratoconus samples were stable (A and B), whereas two were progressive (C and D). Green, yellow, and red colors indicate channels with higher activations, whereas blue denotes low activation. The results reveal that the convolutional neural network is triggered by the point of highest posterior elevation in both groups, albeit slightly expanded to the superior region. The deviation from the best fit sphere in this area could cause a noticeable change in the color of the elevation maps, suggesting the presence of more advanced keratoconus.
Table.
 
A Summary of all Neural Network Designs Is Provided, Including the Maximum Performance Measurements Achieved