Open Access
Articles  |   April 2022
Prediction of Subjective Refraction From Anterior Corneal Surface, Eye Lengths, and Age Using Machine Learning Algorithms
Author Affiliations & Notes
  • Julián Espinosa
    IUFACyT, Universidad de Alicante, San Vicente del Raspeig, Spain
    Departamento de Óptica, Farmacología y Anatomía, Universidad de Alicante, San Vicente del Raspeig, Spain
  • Jorge Pérez
    IUFACyT, Universidad de Alicante, San Vicente del Raspeig, Spain
    Departamento de Óptica, Farmacología y Anatomía, Universidad de Alicante, San Vicente del Raspeig, Spain
  • Asier Villanueva
    IUFACyT, Universidad de Alicante, San Vicente del Raspeig, Spain
  • Correspondence: Julián Espinosa, Departamento de Óptica, Farmacología y Anatomía, Universidad de Alicante, Carretera San Vicente del Raspeig s/n, 03690 San Vicente del Raspeig - Alicante, Spain. e-mail: julian.espinosa@ua.es 
Translational Vision Science & Technology April 2022, Vol.11, 8. doi:https://doi.org/10.1167/tvst.11.4.8
Abstract

Purpose: To develop a machine learning regression model of subjective refractive prescription from minimum ocular biometry and corneal topography features.

Methods: Anterior corneal surface parameters (Zernike coefficients and keratometry), axial length, anterior chamber depth, and age were posed as features to predict subjective refractions. Measurements from 355 eyes were split into training (75%) and test (25%) sets. Different machine learning regression algorithms were trained by 10-fold cross-validation, optimized, and tested. A neighborhood component analysis provided features’ normalized weights in predictions.

Results: Gaussian process regression algorithms provided the best models with mean absolute errors of around 1.00 diopters (D) in the spherical component and 0.15 D in the astigmatic components.

Conclusions: The normalized weights showed that subjective refraction can be predicted from keratometry, age, and axial length alone. The increased topographic detail of the anterior corneal surface implied by a high-order Zernike decomposition, versus a fit to a spherocylindrical surface, is not reflected in improved subjective refraction prediction, which is poor, mainly in the spherical component. However, the highest achievable accuracy differs by only 0.75 D from that of other works with a more exhaustive description of the eye's refractive elements. Although the chosen parameters may not have been the most efficient, applying machine learning and big data to predict subjective refraction can be risky and impractical when evaluating a particular subject at statistical extremes.

Translational Relevance: This work evaluates subjective refraction prediction by machine learning from the anterior corneal surface and ocular biometry. It shows the minimum biometric information required and the highest achievable accuracy.


Introduction
Understanding how the human visual system works has become a recurrent research task in recent years. Since the beginning of the past century, several optical models of the human eye have been developed to predict visual quality or subjective refraction at different complexity levels, comprising both real measurements (in vivo and in vitro) and theoretical approximations.1,2 The two fundamental elements are the cornea and the crystalline lens. Corneal topography can be assessed by different techniques like interferometry, ultrasonography, profile photography, holography, the Placido disk principle, and Scheimpflug photography. Corneal surfaces are usually described by keratometry3 and/or Zernike coefficients,4 cj, which result from fitting a series of Zernike polynomials to height maps. The Zernike decomposition of a surface W can be expressed in polar coordinates as5:  
\begin{equation}W\left( {\rho ,\theta } \right) = \mathop \sum \limits_{j = 0}^{p - 1} {c_j}{Z_j}\left( {\rho ,\theta } \right);\end{equation}
(1)
 
\begin{eqnarray}j &=& 0.5\left( {n\left( {n + 2} \right) + m} \right);\nonumber\\ n &=& \left\lceil {0.5\left( { - 3 + \sqrt {9 + 8j} } \right)} \right\rceil ;\nonumber\\ m &=& 2j - n\left( {n + 2} \right);\end{eqnarray}
(2)
 
\begin{equation}{Z_{n,m}}\left( {\rho ,\theta } \right) = {N_{n,m}}{R_{n,m}}\left( \rho \right){M_m}\left( \theta \right)\end{equation}
(3)
 
\begin{eqnarray}{N_{n,m}} &=& \sqrt {\frac{{2\left( {n + 1} \right)}}{{1 + {\delta _{m0}}}}} ;\nonumber\\ {\rm{\;\;\;}}{R_{n,m}}\left( \rho \right) &=& \mathop \sum \limits_{s = 0}^{0.5\left( {n - \left| m \right|} \right)} \frac{{{{\left( { - 1} \right)}^s}\left( {n - s} \right)!}}{{s!\left[ {0.5\left( {n + \left| m \right|} \right) - s} \right]!\left[ {0.5\left( {n - \left| m \right|} \right) - s} \right]!}}{\rho ^{n - 2s}};\end{eqnarray}
(4.a)
 
\begin{equation}{M_m}\left( \theta \right) = \left\{ \begin{array}{@{}*{1}{c}@{}} {\cos \left( {m\theta } \right)\;\;\;\;\;\;\;\;\;m \ge 0}\\ {\sin \left( {m\theta } \right)\;\;\;\;\;\;\;\;\;m < 0} \end{array}\right.;\end{equation}
(4.b)
where n is the radial order, m is the azimuthal frequency, j is the single index for the Zernike polynomial, p is the number of terms in the expansion, cj are the Zernike coefficients associated with their Zernike polynomials, δm0 is the Kronecker delta function, and ⌈ · ⌉ denotes the ceiling (round-up) operator. Regarding the crystalline lens, on the one hand, phakometry,6 Scheimpflug imaging,7–10 magnetic resonance imaging,11 and optical coherence tomography12,13 have been used to assess in vivo the optical properties, shape, and thickness of the lens. On the other hand, in vitro techniques14–18 have also been followed to evaluate lens shape and power. 
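As a concrete illustration, the single-index mapping of equation (2) and the polynomial evaluation of equations (3) and (4) can be sketched in Python (the function names are ours, not from the paper):

```python
import math

def zernike_nm(j):
    """Single index j -> radial order n and azimuthal frequency m, per equation (2)."""
    n = math.ceil(0.5 * (-3 + math.sqrt(9 + 8 * j)))
    m = 2 * j - n * (n + 2)
    return n, m

def zernike(j, rho, theta):
    """Evaluate the normalized Zernike polynomial Z_j at (rho, theta), equations (3)-(4)."""
    n, m = zernike_nm(j)
    # Normalization N_{n,m} of equation (4.a); (m == 0) plays the role of delta_{m0}
    N = math.sqrt(2 * (n + 1) / (1 + (m == 0)))
    # Radial polynomial R_{n,m}(rho) of equation (4.a)
    R = sum(
        (-1) ** s * math.factorial(n - s)
        / (math.factorial(s)
           * math.factorial((n + abs(m)) // 2 - s)
           * math.factorial((n - abs(m)) // 2 - s))
        * rho ** (n - 2 * s)
        for s in range((n - abs(m)) // 2 + 1)
    )
    # Azimuthal term M_m(theta) of equation (4.b): cosine for m >= 0, sine for m < 0
    M = math.cos(m * theta) if m >= 0 else math.sin(m * theta)
    return N * R * M
```

For example, `zernike_nm(4)` returns `(2, 0)` (defocus), and `zernike(4, 1.0, 0.0)` evaluates the normalized defocus polynomial at the pupil edge.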
Subjective refraction determines the optical power needed to compensate a patient’s refractive errors. It relies on the patient's ability to discern and communicate possible improvements or distortions caused by corrective adjustments made by the optometrist, and thus its determination with automated machines is still not accurate. This procedure takes time and, consequently, slows down the diagnosis process. To reduce the time needed for this process, machine learning has recently been used to predict refractive prescription from physical eye data obtained by wavefront aberrometers,19,20 photorefraction images,21 retinal fundus images,22 other ophthalmologic devices,23 and intraocular lens characteristics.24 The aim of this work is to develop a machine learning regression model that predicts patients’ subjective refractive prescription from the anterior corneal surface and ocular biometry. The first step consists of choosing physiologic descriptors as the model's features. This is a key point because using too many features can degrade prediction performance, even if all features are relevant.25 The strong correlation between the spherical power components and astigmatic components26 of the anterior and posterior corneal surfaces, as well as the fact that the latter contributes only about one-eighth of the eye's refractive power,27 leads to the hypothesis that the anterior surface suffices to assess the whole corneal refractive effect in an eye model. Apart from anterior corneal topography, other physiologic parameters expected to be involved in patients’ subjective refraction were added to the model: axial length (AL), yielding an approach similar to Emsley's schematic eye; the patient's age, on which refractive indices and crystalline lens morphology depend; and, finally, anterior chamber depth (ACD), which can be related to lens location. 
Compared to previous works,20 this proposal offers the advantages of using simpler measurement devices, requiring fewer descriptive data and, therefore, performing faster. However, disregarding the crystalline lens effect will probably yield poorer results. In fact, the distribution of some aberrations between the cornea and lens appears to be auto-compensated.28 Next, the selection and tuning of the machine learning models must be performed with training and test population subsets. The Results section presents an analysis of the main indispensable characteristics and the evaluation of the final predictive accuracy of the selected models. 
Methods
The electronic medical records of 229 patients were retrieved. This study followed the Declaration of Helsinki principles, and the participants gave their written informed consent. Patients with any ocular abnormality other than ametropia, or with previous cataract surgery or artificial lens implants, were excluded. The resulting selected measurements consisted of the anterior corneal surface parameters, AL, ACD, age, and noncycloplegic subjective refractions conducted by optometrists for 201 patients (355 eyes): bilateral eye records from 154 patients and unilateral eye records from 47 patients. Subjective refraction (S, C, α) was transformed through equation (5) to power vector notation29 (M, J0, J45), which consists of components that are independent of one another. Figure 1 shows the histograms of the age and power vector components of the population sample used herein.  
\begin{eqnarray}M &=& S + \frac{C}{2};\;\;\;\;\;\;\;\;\;{J_0} = - \frac{C}{2}\cos \left( {2\alpha } \right);\nonumber\\ {J_{45}} &=& - \frac{C}{2}\sin \left( {2\alpha } \right)\end{eqnarray}
(5)
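The conversion of equation (5) from a spherocylindrical prescription to the power vector components can be sketched as follows (the function name is ours, not from the paper):

```python
import math

def power_vector(S, C, alpha_deg):
    """Spherocylindrical refraction (sphere S, cylinder C, axis alpha in degrees)
    -> power vector (M, J0, J45), per equation (5)."""
    a = math.radians(alpha_deg)
    M = S + C / 2
    J0 = -(C / 2) * math.cos(2 * a)
    J45 = -(C / 2) * math.sin(2 * a)
    return M, J0, J45
```

For instance, a prescription of −2.00 −1.00 × 180 maps to M = −2.50 D, J0 = +0.50 D, J45 = 0 D; the three components are mathematically independent, which is why the models below predict each one separately.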
 
Figure 1.
 
Histograms of the (a) age and (b–d) power vector components (M, J0, J45) obtained from the subjective refraction performed with the sample population.
Both ACD and AL were measured with an IOLMaster 500 optical biometer (Carl Zeiss Meditec, Inc., Dublin, CA, USA). Corneal surface measurements were taken with a Sirius topographer (CSO, Firenze, Italy). The Sirius is a Scheimpflug camera combined with a Placido disk imaging system that measures 256 corneal meridians at 30 radial distances. It provides several corneal physiologic descriptors that can be selected as features in machine learning (Zernike coefficients from both surfaces at different diameters, keratometry, etc.). The Zernike coefficients describing the anterior corneal surface at a pupil diameter of 4.5 mm and anterior corneal surface astigmatism (Ckk) were selected from the measurements taken by the Sirius topographer as corneal physiologic descriptors. As the device does not provide the defocus coefficient c4, we used the keratometry equivalent sphere (Mk) instead, which was computed from the keratometry for the same pupil diameter as  
\begin{equation}{M_k} = 0.5\left( {{n_k} - 1} \right)\left( {\frac{1}{{{R_f}}} + \frac{1}{{{R_s}}}} \right)\end{equation}
(6)
where nk is the keratometric index and Rf and Rs are, respectively, the flattest and steepest anterior corneal curvature radii. The conventional keratometric index (1.3375) was used in this work; however, its value is not relevant for the machine learning algorithms because it is constant. Keratometry data (Ckk) were also transformed into standard power vector notation29 according to equation (5). 
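Equation (6) is a one-line computation; a minimal sketch (function name ours, radii assumed in meters so the result comes out in diopters):

```python
def keratometry_equivalent_sphere(Rf, Rs, nk=1.3375):
    """Keratometry equivalent sphere Mk of equation (6).

    Rf, Rs: flattest and steepest anterior corneal curvature radii in meters;
    nk: keratometric index (the conventional 1.3375 by default).
    Returns Mk in diopters.
    """
    return 0.5 * (nk - 1) * (1.0 / Rf + 1.0 / Rs)
```

For a typical cornea with both radii at 7.8 mm, this gives roughly 43.3 D, the familiar keratometric power.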
The database was split into a training subset (75% of the patients) and a test subset (25%) to avoid overfitting. It comprised the ocular measurements of both unilateral and bilateral patients. Splitting was performed by preventing the eyes of the same patient from appearing in both subsets. The training subset consisted of 36 unilateral eyes plus both eyes of 115 bilateral patients. The test subset included 11 unilateral eyes plus both eyes of 39 bilateral patients. Table 1 presents the descriptive statistics that summarize the refractive data and ages of the different sets. 
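The patient-grouped split described above can be sketched as follows. The eye counts come from the paper (154 bilateral and 47 unilateral patients, 355 eyes); the function name and the random seed are ours, and splitting 25% of patients will not reproduce the paper's exact per-subset eye counts:

```python
import random

def split_by_patient(patient_ids, test_frac=0.25, seed=0):
    """Split eye indices into train/test so both eyes of a patient stay in one subset."""
    ids = sorted(set(patient_ids))
    random.Random(seed).shuffle(ids)
    n_test = round(test_frac * len(ids))
    test_ids = set(ids[:n_test])
    train = [i for i, p in enumerate(patient_ids) if p not in test_ids]
    test = [i for i, p in enumerate(patient_ids) if p in test_ids]
    return train, test

# 154 bilateral patients (two eyes each) plus 47 unilateral patients = 355 eyes
patient_ids = [p for p in range(154) for _ in range(2)] + list(range(154, 201))
train_idx, test_idx = split_by_patient(patient_ids)
```

Splitting by patient rather than by eye matters because the two eyes of one person are strongly correlated; letting them straddle the split would leak information into the test set.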
Table 1.
 
Descriptive Statistics Summarizing the Subjective Refractive Data and Ages of the Different Sets
Data set size determines the maximum number of features that can be postulated as significant characteristics in the machine learning algorithm. In this work, the training data included 266 eyes. However, a 10-fold cross-validation technique was applied to the training data to assess how the results of a statistical analysis would generalize to an independent data set. This process consists of randomly partitioning the training sample into 10 equally sized subsamples. Of these 10 sets, a single subsample is retained as the validation data to test the model. The remaining nine sets are used as data to train the model. Therefore, the effective size of the training data is 266 × 9 / 10 ≈ 239. A rule of thumb establishes that the maximum number of parameters for a good-performing model is limited to one-tenth of the amount of training data. This left the maximum number of features at 23. 
Sirius provides Zernike coefficients up to the seventh order except piston (c0) and defocus (c4), that is, 34 coefficients from the anterior corneal surface decomposition, which is above the maximum limit of 23 features. Bearing in mind that a patient’s age, Mk, ACD, and AL are all features that characterize the eye apart from the cornea, such a detailed anterior corneal surface description had to be forgone to meet that limit. Tilts (c1 and c2) were excluded for their little relevance, as they are naturally compensated by eye movements, and the remaining coefficients were selected in ascending order without exceeding the limit. The fifth order includes 17 Zernike coefficients that, together with age, Mk, ACD, and AL, make up a set of 21 selected features as a first approach to characterize each eye and to train the models. As part of the data preprocessing step, a filter-type feature selection algorithm30 that used a diagonal adaptation of the neighborhood component analysis31 was applied to determine the features’ normalized weights (NWs). This reports the importance of each feature in the regression models of the power vector components and allows different feature selections to be used to train and tune new models. An extra parameter with a random value (rv) was added to check the significance of the 21 selected features in the model. The NW of the extra random parameter is expected to be zero. If not, any feature with an NW that equals or falls below it can be discarded because its significance would be equal to or worse than that of a random variable. 
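The random-probe idea can be reproduced outside MATLAB. The sketch below uses scikit-learn permutation importance as a stand-in for the paper's neighborhood component analysis (MATLAB's fsrnca has no direct Python equivalent); the data are synthetic and every name is ours:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance

rng = np.random.default_rng(1)
# Synthetic stand-in data: features 0-2 are informative, 3-5 are pure noise
X = rng.normal(size=(200, 6))
y = 2 * X[:, 0] + X[:, 1] - 0.5 * X[:, 2] + rng.normal(scale=0.1, size=200)

# Append a random probe column; any feature scoring at or below it is discarded,
# mirroring the paper's extra random-value (rv) parameter
X_probe = np.hstack([X, rng.normal(size=(200, 1))])
model = RandomForestRegressor(random_state=0).fit(X_probe, y)
imp = permutation_importance(model, X_probe, y, n_repeats=10,
                             random_state=0).importances_mean

probe_score = imp[-1]
kept = [i for i in range(X.shape[1]) if imp[i] > probe_score]
```

The `kept` list retains the genuinely informative features, while anything no more important than the random probe is dropped, which is exactly the decision rule described above.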
Regression models were trained with the regression learner app32 of MATLAB (version R2021b; MathWorks, Inc., Natick, MA, USA), which provides linear regression models,33 regression trees,34 support vector machines (SVMs),35 Gaussian process regression (GPR) models,36 ensembles of trees,3739 and neural networks.40 Algorithms were trained to predict each refraction vector component separately because of their mathematical independence. The models that obtained the best root mean squared errors (RMSEs) were later hyperparameter-tuned by Bayesian optimization. Finally, the best models were tested with the test subset data. Otherwise, a model might only perform well with the training data but may fail to predict anything useful in yet unseen data. 
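The paper trained its models in MATLAB's Regression Learner; a rough scikit-learn analogue of one such run (a GPR scored by 10-fold cross-validation on synthetic stand-in data, with all names and sizes ours) might look like:

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel, WhiteKernel
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
# Synthetic stand-in for the training subset: 21 features per eye, one target
# (the real targets, M, J0, and J45, are each fitted by a separate model)
X = rng.normal(size=(120, 21))
y = 0.5 * X[:, 0] + rng.normal(scale=0.1, size=120)

# GPR with a per-feature (ARD) RBF kernel plus a noise term; kernel
# hyperparameters are fitted by maximizing the marginal likelihood
gpr = GaussianProcessRegressor(
    kernel=ConstantKernel() * RBF(length_scale=np.ones(21)) + WhiteKernel(),
    normalize_y=True)

# 10-fold cross-validated RMSE, analogous to the paper's validation scheme
rmse = -cross_val_score(gpr, X, y, cv=10,
                        scoring="neg_root_mean_squared_error").mean()
```

For the hyperparameter search that the paper performs by Bayesian optimization, scikit-learn users would typically wrap such a model in a search object rather than tune by hand.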
Results
The neighborhood component analysis feature selection with regularization was applied to determine feature importance to predict the power vector components of the subjective refraction. Figure 2 is a bar graph showing the computed NWs of all 21 features, plus the random one to predict the three power vector components. 
Figure 2.
 
Normalized weights of the features for predicting the three power vector components. cj stands for the jth Zernike coefficient in line with (1).
The next step consisted of determining the machine learning algorithm that worked best. Different regression models were trained using the regression learner app32 with the 266 eyes from the training subset, characterized by the 21 features and following the above-described procedure. Table 2 shows the RMSE and the coefficient of determination (R2) values obtained for each power vector component with the trained models that gave the best results. R2 compares the trained model to a constant model whose response equals the training response mean. If the model is worse than this constant model, then R2 is negative and the model is discarded. 
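The R2 criterion just described is easy to make concrete (function name ours): R2 is 1 minus the ratio of the model's squared error to that of the constant mean predictor, so a model worse than the mean scores negative.

```python
import numpy as np

def r_squared(y_true, y_pred):
    """R^2 relative to the constant model that always predicts mean(y_true)."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    ss_res = np.sum((y_true - y_pred) ** 2)          # model's squared error
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)   # constant model's squared error
    return 1 - ss_res / ss_tot
```

For example, `r_squared([1, 2, 3], [3, 3, 3])` returns −1.5: these predictions are worse than simply predicting the mean, so the model would be discarded.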
Table 2.
 
RMSE and Coefficient of Determination (R2) Values Obtained for Each Power Vector for the Training Subset With All the Trained Models
According to the RMSE, the GPR models provided the best results in the first approaches for the three power vector components, although other models approximately matched those values for some components. Therefore, the GPR, ensemble, and SVM models were subsequently optimized by hyperparameter tuning; the best results were again obtained with the GPR models. The results appear in the last row of Table 2. In Figures 3a and 3b, the responses obtained with the optimized GPR models for the power vector components of the training set are represented versus the true data. 
Figure 3.
 
Predicted responses obtained with the optimized GPR models versus the true data of the training set for the three power vector components.
The performance of the optimized models in the test subset (89 eyes) was also evaluated. Figures 4a and 4b show the predicted responses versus the true response for the three power vector components. Table 3 includes the RMSE, mean absolute error (MAE), and R2 obtained from the test subset for each power vector through the optimized GPR model trained for 21 features. 
Figure 4.
 
Predicted responses obtained with the optimized GPR models versus the true data of the test set for the three power vector components.
Table 3.
 
RMSE, MAE, and Coefficient of Determination (R2) Obtained for Each Power Vector for the Test Subset With the Optimized GPR Trained Model and the Proposed Features
Discussion and Conclusions
Machine learning regression algorithms were trained to predict the subjective refraction prescription from the anterior corneal shape parameters, eye lengths, and the patients’ age. The Gaussian process regression algorithms provided the best models, with different accuracies for each subjective refraction power vector component. Algorithms were applied to a small data set compared to other studies in the literature. However, the data adhered to the rule of 10 in the worst case, namely, the amount of training data needed for a good-performing model being 10 times the number of parameters in the model. Therefore, the performance of the obtained models was reliable. 
A selection of the weighted features shown in Figure 2 would provide equal or better results when performing the regression models for each power vector component than the full set of 21 features. The effect of neglecting some of those features can be assessed by establishing different weight thresholds to make the selections. Two different thresholds were set. The first comprised features with NWs above zero. The second contained those with NWs above 20%, which discarded the ACD effect along with the higher-order aberrations c6 (vertical trefoil), c7 (vertical coma), c12 (primary spherical), c13 (vertical secondary astigmatism), and c19 (oblique secondary trefoil). Hence, in descending order of NW, the M component features were AL, the keratometry equivalent sphere, and age. J0 and J45 were respectively modeled using the primary astigmatism Zernike coefficients (c3 and c5). The independence of the three components was evidenced by their showing different feature NWs. Hence, a different subset of features could be selected for each one to perform the regression models. However, common feature selections were sought for the three components for simplicity's sake and to obtain only one set of features per threshold. Table 4 shows the results, for the test subset, of the GPR models trained and optimized with features selected according to these two thresholds for the three power vector components. 
Table 4.
 
RMSE, MAE, and Coefficient of Determination (R2) Obtained for Each Power Vector for the Test Subset With the Optimized GPR Trained Model With Different Eye Features
The fact that the regression performed with the features of NW >0 slightly improved the RMSE is the first point to be noted. As expected, using too many features degrades prediction performance. Indeed, the R2 of component J45 improved. The results of the optimized GPR model with all features and of that with the features of NW >20% (five features) were similar. Consequently, a regression model with only these five parameters can achieve maximum accuracy. Disregarding high-order aberrations and ACD did not significantly worsen the results. 
Based on the above result, good results can be expected by considering age, AL, and keratometry, which describes the corneal defocus and astigmatism. Table 4 also presents the errors and R2 obtained from the test subset for the models trained with keratometry, age, and AL. Once again, the results were not different from those obtained using the models with all the features. 
The analysis of the computed normalized weights of features allowed the number of required parameters to be minimized. This work revealed the highest accuracy that can be obtained using the available information and confirmed some preestablished facts. The models’ predictive ability for components J0 and J45 was good (around 0.25 diopters [D] of RMSE), but the coefficient of determination for J45 was poor. For spherical component M, although R2 was good, the error (MAE around 1.00 D) seemed to be too high to employ it as a technique to predict subjective refraction. Notwithstanding, the achieved accuracies were not much worse than those reported in previous works,20 although the presented proposal used simpler measurement devices (a keratometer would work) and required fewer descriptive data. 
Our results are probably less accurate because the models contained no data about the posterior corneal surface and crystalline lens aberrations. The machine learning algorithms employed age, ACD, and AL to approximate the effect of the eye's inner optical part. On the one hand, the geometric parameters of the human crystalline lens and its refractive indices are age dependent and, therefore, so is the lens refractive power. The analysis of the features' NWs showed that the NW of age was above 20% when modeling the M component. This fact confirmed the marked dependence of the refractive spherical component on this feature. The refractive power of the missing elements in the eye model was partially captured by the age feature. A patient’s age also appeared in the NW feature analysis for predicting J0 and J45, which could indicate the age dependence of astigmatism.42,43 The ACD effect, which was hypothesized to be related to the lens location in the eye, and the effect of the higher-order Zernike coefficients were assessed from the results obtained with the parameters of NW >0 in the M component model. The result obtained considering these effects was not significantly better (only about 0.1 D of MAE) than that obtained with the model using the features of NW >20%, which excluded these parameters. The comparison of the astigmatic power vector components showed that errors were no higher when those coefficients were not considered. Hence, we conclude that their influence was slight and, therefore, machine learning algorithms can dispense with these features without making the predictions worse. 
On the other hand, as AL determines the location of the retina, it is fundamental to model the spherical component. Indeed, as this feature had the highest NW for the M component prediction, not taking it into account would provide poor results. It was also present for the J0 component prediction, but with an NW below 20% and, hence, low significance. 
Astigmatic components are expected to be solely determined by the anterior corneal surface shape. However, as just noted, the NW analysis of features to predict J45 and J0 showed other features apart from c3 and c5 (the oblique and vertical astigmatism Zernike coefficients, respectively), albeit with lower NWs. Optimized GPR models that employed only the astigmatic coefficients were also trained and tested. The results in Table 5 are practically no different from those obtained using all the parameters with nonzero NWs, which implies that anterior corneal surface topography determines astigmatism. These results, together with the fact that the errors of the models for the astigmatic components were lower than those of the spherical one, confirmed the strong correlation between the astigmatic components26 of the anterior and posterior corneal surfaces. 
Table 5.
 
RMSE, MAE, and Coefficient of Determination (R2) Obtained for the Astigmatic Power Vector for the Test Subset With the Optimized GPR Trained Model With c3 and c5 Coefficients
The weight feature analysis also allowed us to conclude that no precise corneal surface description is needed for subjective refractive prescriptions. The errors of the astigmatic components for all the evaluated models remained unchanged, regardless of whether keratometry or a high-order anterior corneal Zernike decomposition was considered. The obtained MAE values were of the same order as those obtained by Rampat et al.,20 who contemplated whole-eye aberrations. Therefore, the contribution of the posterior corneal surface and the crystalline lens to the eye's astigmatism falls within those errors. For the spherical component, the difference with that work20 was below 0.75 D. Therefore, assuming the strong correlation between the spherical power26 of the corneal surfaces, disregarding the lens would be partially, but not completely, compensated by considering age. The difference in patient demographics between the two works is that their refractive spherical equivalent ranged between −6.75 D and 6.13 D, whereas in this work it ranged between −11 D and 6.25 D. This could also be a cause of the worse results obtained herein for the wider range of ametropias. 
Machine learning allowed us to confirm that the low-order aberrations obtained from the anterior corneal surface, together with age and AL, suffice to approximately predict subjective refraction. This is an advantage in diagnosis speed over the traditional subjective technique. The roles of the proposed features in modeling each power vector component were assessed. 
Traditional subjective refraction is nowadays still considered the universal gold standard in the evaluation of refractive error. However, it is time-consuming, taking about 5 minutes per eye44 for a well-trained eye care professional, and its subjective nature implies intra- and interoptometrist deviations of around ±0.25 D in the spherical component45,46 and above 0.75 D in the astigmatic component.47 
The aim of this work was to develop a machine learning regression model that predicts patients’ subjective refractive prescription from the anterior corneal surface and ocular biometry. It is an objective technique that can be used to measure the spherical and cylindrical refractive errors of the human eye. Objective refraction is not only useful but often essential, for example, when examining young children and patients with mental or language difficulties. Cycloplegic refraction, for example, requires little or no cooperation from the patient, an advantage over other techniques, such as the one proposed in this study, which requires biometry and corneal surface measurements. However, the administration of cycloplegics may cause undesirable ocular and/or systemic side effects. Moreover, although objective refraction can provide good visual outcomes, neural processing must also be considered. 
Compared to the work of Rampat et al.,20 our proposal offers the advantages of using simpler measurement devices, requiring fewer descriptive data and, therefore, performing faster. It fits the trend of unsupervised methods that minimize misunderstandings between the clinician and the patient, as well as measurement variability and time. However, we must consider that the proposed approaches statistically predict subjective refraction within a tolerance range for a population; individually, the errors they may make can be intolerable. So although the choice of input parameters in this work might not be the most predictively efficient, a general model should clearly show its extreme errors and not merely its average performance. Otherwise, establishing a standard predictive method based on machine learning algorithms and big data can be risky and impractical when evaluating a particular patient. 
Acknowledgments
The authors thank Vissum Grupo Miranza (Alicante) for providing the electronic medical records. 
Disclosure: J. Espinosa, None; J. Pérez, None; A. Villanueva, None 
References
Atchison DA. Optical models for human myopic eyes. Vision Res. 2006; 46(14): 2236–2250. [CrossRef] [PubMed]
Atchison DA, Thibos LN. Optical models of the human eye. Clin Exp Optom. 2016; 99(2): 99–106. [CrossRef] [PubMed]
Schwiegerling J. Field Guide to Visual and Ophthalmic Optics. Bellingham: SPIE; 2004.
Schwiegerling J, Greivenkamp JE, Miller JM. Representation of videokeratoscopic height data with Zernike polynomials. JOSA A. 1995; 12(10): 2105–2113. [CrossRef] [PubMed]
Thibos LN, Applegate RA, Schwiegerling JT, Webb R; VSIA Standards Taskforce Members. Vision science and its applications: standards for reporting the optical aberrations of eyes. J Refract Surg. 2002; 18(5): S652–S660. [CrossRef] [PubMed]
Mutti DO, Zadnik K, Adams AJ. A video technique for phakometry of the human crystalline lens. Invest Ophthalmol Vis Sci. 1992; 33(5): 1771–1782. [PubMed]
Dubbelman M, Van der Heijde GL. The shape of the aging human lens: curvature, equivalent refractive index and the lens paradox. Vis Res. 2001; 41(14): 1867–1877. [CrossRef] [PubMed]
Koretz JF, Cook CA, Kaufman PL. Aging of the human lens: changes in lens shape at zero-diopter accommodation. JOSA A. 2001; 18(2): 265–272. [CrossRef] [PubMed]
Hermans E, Dubbelman M, van der Heijde R, Heethaar R. The shape of the human lens nucleus with accommodation. J Vis. 2007; 7(10): 16. [CrossRef] [PubMed]
Marussich L, Manns F, Nankivil D, et al. Measurement of crystalline lens volume during accommodation in a lens stretcher. Invest Ophthalmol Vis Sci. 2015; 56(8): 4239–4248. [CrossRef] [PubMed]
Jones CE, Atchison DA, Meder R, Pope JM. Refractive index distribution and optical properties of the isolated human lens measured using magnetic resonance imaging (MRI). Vis Res. 2005; 45(18): 2352–2366. [CrossRef] [PubMed]
Richdale K, Bullimore MA, Zadnik K. Lens thickness with age and accommodation by optical coherence tomography. Ophthalmic Physiol Opt. 2008; 28(5): 441–447. [CrossRef] [PubMed]
Lehman BM, Berntsen DA, Bailey MD, Zadnik K. Validation of OCT-based crystalline lens thickness measurements in children. Optom Vis Sci. 2009; 86(3): 181. [CrossRef] [PubMed]
Howcroft MJ, Parker JA. Aspheric curvatures for the human lens. Vis Res. 1977; 17(10): 1217–1223. [CrossRef] [PubMed]
Sivak JG, Kreuzer RO. Spherical aberration of the crystalline lens. Vis Res. 1983; 23(1): 59–70. [CrossRef] [PubMed]
Pierscionek BK, Chan DY. Refractive index gradient of human lenses. Optom Vis Sci. 1989; 66(12): 822–829. [CrossRef] [PubMed]
Borja D, Manns F, Ho A, et al. Optical power of the isolated human crystalline lens. Invest Ophthalmol Vis Sci. 2008; 49(6): 2541–2548. [CrossRef] [PubMed]
Martinez-Enriquez E, de Castro A, Mohamed A, et al. Age-related changes to the three-dimensional full shape of the isolated human crystalline lens. Invest Ophthalmol Vis Sci. 2020; 61(4): 11. [CrossRef] [PubMed]
Leube A, Leibig C, Ohlendorf A, Wahl S. Machine learning based predictions of subjective refractive errors of the human eye. In: Moucek R, Fred A, Gamboa H, eds. 12th International Conference on Health Informatics. Prague: SciTePress; 2019: 199–205.
Rampat R, Debellemanière G, Malet J, Gatinel D. Using artificial intelligence and novel polynomials to predict subjective refraction. Sci Rep. 2020; 10(1): 8565. [CrossRef] [PubMed]
Chun J, Kim Y, Shin KY, et al. Deep learning–based prediction of refractive error using photorefraction images captured by a smartphone: model development and validation study. JMIR Med Inform. 2020; 8(5): e16225. [CrossRef] [PubMed]
Varadarajan AV, Poplin R, Blumer K, et al. Deep learning for predicting refractive error from retinal fundus images. Invest Ophthalmol Vis Sci. 2018; 59(7): 2861–2868. [CrossRef] [PubMed]
de Araujo AL, Santos HDP, Sganzerla D, Umpierre RN, Schor P. Development of a multivariable prediction model to predict subjective refraction in patients with refractive errors. Invest Ophthalmol Vis Sci. 2020; 61(7): 5172–5172.
Yamauchi T, Tabuchi H, Takase K, Masumoto H. Use of a machine learning method in predicting refraction after cataract surgery. J Clin Med. 2021; 10(5): 1103. [CrossRef] [PubMed]
Kuhn M, Johnson K. Applied Predictive Modeling. New York: Springer; 2013.
Mas D, Espinosa J, Domenech B, Perez J, Kasprzak H, Illueca C. Correlation between the dioptric power, astigmatism and surface shape of the anterior and posterior corneal surfaces. Ophthalmic Physiol Opt. 2009; 29(3): 219–226. [CrossRef] [PubMed]
He JC. A theoretical study on contribution of the posterior corneal surface to wavefront aberrations of the eye. Invest Ophthalmol Vis Sci. 2007; 48(13): 2792. [PubMed]
Artal P, Benito A, Tabernero J. The human eye is an example of robust optical design. J Vis. 2006; 6(1): 1. [CrossRef] [PubMed]
Thibos LN, Wheeler W, Horner D. Power vectors: an application of Fourier analysis to the description and statistical analysis of refractive error. Optom Vis Sci. 1997; 74(6): 367–375. [CrossRef] [PubMed]
Guyon I, Elisseeff A. An introduction to variable and feature selection. J Mach Learn Res. 2003; 3: 1157–1182.
Yang W, Wang K, Zuo W. Neighborhood component feature selection for high-dimensional data. J Comput. 2012; 7(1): 161–168.
MathWorks. Regression learner app—MATLAB & Simulink. https://www.mathworks.com/help/stats/regression-learner-app.html. Accessed July 27, 2021.
Huber PJ. Robust statistics. In: Lovric M, ed. International Encyclopedia of Statistical Science. Heidelberg: Springer; 2011: 1248–1251.
Breiman L, Friedman J, Stone CJ, Olshen RA. Classification and Regression Trees. Boca Ratón: Chapman and Hall/CRC; 1984.
Vapnik V. The Nature of Statistical Learning Theory. 2nd ed. New York: Springer-Verlag; 2000.
Rasmussen CE, Williams CKI. Gaussian Processes for Machine Learning. Cambridge, MA: MIT Press; 2005.
Breiman L. Bagging predictors. Mach Learn. 1996; 24(2): 123–140.
Breiman L. Random forests. Mach Learn. 2001; 45(1): 5–32. [CrossRef]
Hastie T, Tibshirani R, Friedman JH. The Elements of Statistical Learning: Data Mining, Inference, and Prediction. 2nd ed. New York: Springer; 2009.
Nocedal J, Wright S. Numerical Optimization. 2nd ed. New York: Springer-Verlag; 2006.
MathWorks. Choose regression model options—MATLAB & Simulink—MathWorks United Kingdom. https://uk.mathworks.com/help/stats/choose-regression-model-options.html. Accessed November 27, 2021.
Shao X, Zhou KJ, Pan AP, et al. Age-related changes in corneal astigmatism. J Refract Surg. 2017; 33(10): 696–703. [CrossRef] [PubMed]
Namba H, Sugano A, Murakami T, et al. Age-related changes in astigmatism and potential causes. Cornea. 2020; 39: S34. [CrossRef] [PubMed]
Carracedo G, Carpena-Torres C, Serramito M, Batres-Valderas L, Gonzalez-Bergaz A. Comparison between aberrometry-based binocular refraction and subjective refraction. Transl Vis Sci Technol. 2018; 7(4): 11. [CrossRef] [PubMed]
Rosenfield M, Chiu NN. Repeatability of subjective and objective refraction. Optom Vis Sci. 1995; 72(8): 577–579. [CrossRef] [PubMed]
Zadnik K, Mutti DO, Adams AJ. The repeatability of measurement of the ocular components. Invest Ophthalmol Vis Sci. 1992; 33(7): 2325–2333. [PubMed]
MacKenzie GE. Reproducibility of sphero-cylindrical prescriptions. Ophthalmic Physiol Opt. 2008; 28(2): 143–150. [CrossRef] [PubMed]
Figure 1.
 
Histograms of the (a) age and (b–d) power vector components (M, J0, J45) obtained from the subjective refraction performed with the sample population.
Figure 2.
 
Normalized weights of the features for predicting the three power vector components. cj stands for the jth Zernike coefficient in line with (1).
Figure 3.
 
Predicted responses obtained with the optimized GPR models versus the true data of the training set for the three power vector components.
Figure 4.
 
Predicted responses obtained with the optimized GPR models versus the true data of the test set for the three power vector components.
Table 1.
 
Descriptive Statistics Summarizing the Subjective Refractive Data and Ages of the Different Sets
Table 2.
 
RMSE and Coefficient of Determination (R2) Values Obtained for Each Power Vector for the Training Subset With All the Trained Models
Table 3.
 
RMSE, MAE, and Coefficient of Determination (R2) Obtained for Each Power Vector for the Test Subset With the Optimized GPR Trained Model and the Proposed Features
Table 4.
 
RMSE, MAE, and Coefficient of Determination (R2) Obtained for Each Power Vector for the Test Subset With the Optimized GPR Trained Model With Different Eye Features
Table 5.
 
RMSE, MAE, and Coefficient of Determination (R2) Obtained for the Astigmatic Power Vector for the Test Subset With the Optimized GPR Trained Model With c3 and c5 Coefficients