Abstract
Purpose:
Descemet membrane endothelial keratoplasty (DMEK) is the preferred method for treating corneal endothelial dysfunction, such as Fuchs endothelial corneal dystrophy (FECD). The surgical indication is based on the patients’ symptoms and the presence of corneal edema. We developed an automated tool based on deep learning to detect edema in corneal optical coherence tomography images. This study aimed to evaluate this approach in edema detection before Descemet membrane endothelial keratoplasty surgery, for patients with or without FECD.
Methods:
We used our previously described model, which classifies each pixel in corneal optical coherence tomography images as “normal” or “edema.” We included 1992 images of normal and preoperative edematous corneas. We calculated the edema fraction (EF), defined as the ratio between the number of pixels labeled as “edema” and the number of pixels representing the cornea, for each patient. Differential central corneal thickness (DCCT), defined as the difference in central corneal thickness before and 6 months after surgery, was used to quantify preoperative edema. The area under the curve (AUC) of EF for edema detection was calculated for several DCCT thresholds, and a value of 20 µm was selected to define significant edema because it provided the highest AUC.
Results:
The area under the receiver operating characteristic curve of EF for the detection of a DCCT of 20 µm was 0.97 for all patients, 0.96 for FECD and normal patients only, and 0.99 for non-FECD and normal patients only. The optimal EF threshold was 0.143 for all patients and for patients with FECD.
Conclusions:
Our model is capable of objectively detecting minimal corneal edema before Descemet membrane endothelial keratoplasty surgery.
Translational Relevance:
Deep learning can help to interpret optical coherence tomography scans and aid the surgeon in decision-making.
This study was conducted retrospectively at the Rothschild Foundation Hospital in Paris, France, in accordance with the tenets of the 1964 Declaration of Helsinki. The research was approved by the Rothschild Foundation Hospital Review Board (IRB 00012671). Informed consent was obtained from all participants.
We collected data as follows for the pre-DMEK group: we searched for patients who underwent DMEK surgery in our corneal graft registry between October 2017 and June 2020. The recorded data were reviewed manually to include only patients with a successful surgery defined by an increase in visual acuity and a reduction of corneal thickness 6 months after surgery, and an available pachymetry-wide or pachymetry examination obtained from the Avanti OCT (RTVue, Optovue, Fremont, CA) before DMEK surgery. Each examination is composed of eight radial scans evenly spaced by 22.5°.
We included normal cases with the following exclusion criteria: history of ocular surgery or trauma, any corneal disease including dystrophies, infectious keratitis, dry eye disease, and contact lens wear.
Image resolution was 1536 × 640 pixels for the 9-mm scans and 1020 × 640 pixels for the 6-mm scans. All 9-mm scans were cropped laterally to the central 1020 pixels to obtain the same size as the 6-mm scans. No other preprocessing technique was applied to the images.
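The lateral crop described above can be sketched as follows. This is a minimal, dependency-free illustration, not the authors' code: the function name `center_crop_columns` is hypothetical, the image is represented as a plain list of rows, and the crop is assumed to be symmetric about the image center.

```python
def center_crop_columns(image, target_width=1020):
    """Keep the central `target_width` columns of a row-major image (list of rows)."""
    width = len(image[0])
    if width < target_width:
        raise ValueError("image is narrower than the target width")
    # Assumption: the same margin is removed on the left and on the right.
    start = (width - target_width) // 2
    return [row[start:start + target_width] for row in image]


# Dummy 9-mm scan: 640 rows of 1536 pixels, each pixel storing its column index.
scan_9mm = [list(range(1536)) for _ in range(640)]
cropped = center_crop_columns(scan_9mm)
```

With array libraries the same operation is a single slice along the width axis; the list version is used here only to keep the sketch self-contained.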
For both the normal and pre-DMEK groups, we report the principal characteristics: number of patients and eyes, mean age, and mean CCT. The pre-DMEK patients were organized by clinical category: FECD, decompensation after cataract surgery, and decompensation after an Artisan anterior chamber intraocular lens. For each category, we report the number of reinterventions or triple procedures (DMEK and phacoemulsification), presence of clinical edema, mean postoperative CCT, preoperative best-corrected visual acuity (BCVA), and BCVA at 3 months and at 6 months.
All images were screened by our algorithm. The number of pixels predicted as edema and normal, as well as the total number of pixels representing the cornea, were counted for each image. To remove floating isolated artifacts in the background that would skew the pixel counts, we created a binary mask in which pixels predicted as edema or normal were set to 1 and all others to 0. Then, only the largest connected component of that mask was used to count the pixels. The main outcome measure was the edema fraction (EF), defined as the ratio between the number of pixels labeled as edema and the number representing the cornea, averaged over all scans of each patient. The results are visualized as a color map of the same size as the original image. A colorimetric scale, as described previously [29], was used to represent the output values of the network, with hot colors reflecting a high probability of edema and cold colors a lower one.
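The pixel-counting procedure above can be illustrated with a small sketch. It is not the authors' implementation: the label encoding (`BACKGROUND`, `NORMAL`, `EDEMA`), the function names, and the use of 4-connectivity are assumptions made for this example, and the per-patient value is assumed to be the mean of the per-scan EF values.

```python
from collections import deque

BACKGROUND, NORMAL, EDEMA = 0, 1, 2  # hypothetical label encoding


def largest_component(mask):
    """Return the (row, col) pixels of the largest 4-connected component of a binary mask."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    best = set()
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill starting from an unvisited foreground pixel.
            component, queue = set(), deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                component.add((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if 0 <= ny < rows and 0 <= nx < cols and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if len(component) > len(best):
                best = component
    return best


def edema_fraction(label_map):
    """EF of one scan: edema pixels / cornea pixels, restricted to the largest component."""
    mask = [[int(v != BACKGROUND) for v in row] for row in label_map]
    cornea = largest_component(mask)
    if not cornea:
        return 0.0
    edema = sum(1 for (y, x) in cornea if label_map[y][x] == EDEMA)
    return edema / len(cornea)


def patient_edema_fraction(label_maps):
    """Assumed per-patient EF: mean of the per-scan EF values."""
    return sum(edema_fraction(m) for m in label_maps) / len(label_maps)


# Toy scan: an 8-pixel cornea (3 edema pixels) plus one isolated edema artifact at top right.
scan = [
    [0, 0, 0, 0, 0, 2],
    [1, 1, 2, 2, 0, 0],
    [1, 1, 1, 2, 0, 0],
    [0, 0, 0, 0, 0, 0],
]
normal_scan = [[1, 1], [1, 1]]
ef_scan = edema_fraction(scan)
ef_patient = patient_edema_fraction([scan, normal_scan])
```

The isolated artifact is excluded because it does not belong to the largest component, so the toy scan yields 3 edema pixels out of 8 cornea pixels. On full-size images a library routine for connected-component labeling would be far faster than this pure-Python flood fill.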
To quantify corneal edema objectively before surgery and account for subclinical edema, we used the differential CCT (DCCT), defined as the difference between preoperative and postoperative mean CCT measured in the 3 central mm of the cornea by the OCT device. It should be noted that we included successful surgeries only, because by considering that surgery is a success, we hypothesize that there is no residual edema on postoperative images. Therefore, DCCT should provide an accurate measurement of the amount of central preoperative edema. Because the Descemet membrane can be thickened in FECD [33], it is possible that some patients have a minimal DCCT with no real edema. Therefore, the optimal threshold to define edema using DCCT is not known. Hence, to compare the EF with DCCT, we used a receiver operating characteristic (ROC) curve analysis of the EF with thresholding based on different DCCT values of 0, 20, 25, and 30 µm. The DCCT threshold having the highest area under the curve (AUC) was used to define significant edema for the rest of the study. Normal patients were considered to have a DCCT of 0 µm to be comparable with the other patients.
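The threshold sweep described above can be sketched as follows. This is an illustration under stated assumptions, not the study's analysis code: the AUC is computed with the rank-based (Mann–Whitney) formulation, an eye is treated as edema-positive when its DCCT is strictly above the threshold (so that normal eyes fixed at 0 µm remain negative at the 0 µm threshold; the text does not spell out the boundary convention), and the EF and DCCT values are invented toy numbers, not study data.

```python
def roc_auc(scores, labels):
    """AUC as the probability that a positive case scores higher than a negative one (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("both classes are needed to compute an AUC")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def auc_by_dcct_threshold(ef, dcct, thresholds=(0, 20, 25, 30)):
    """AUC of EF for detecting DCCT above each candidate threshold (in µm)."""
    return {t: roc_auc(ef, [d > t for d in dcct]) for t in thresholds}


# Toy data (hypothetical): EF per patient and DCCT in µm; normal eyes have DCCT = 0.
ef = [0.02, 0.05, 0.10, 0.30, 0.12, 0.40, 0.60, 0.80]
dcct = [0, 0, 12, 18, 22, 28, 45, 90]

aucs = auc_by_dcct_threshold(ef, dcct)
best_threshold_um = max(aucs, key=aucs.get)  # first threshold reaching the highest AUC
```

In the study, the DCCT threshold retained by this kind of comparison was 20 µm; the toy numbers above are not meant to reproduce that result.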
To compare the efficiency of our method in cases of minimal edema with the existing Scheimpflug-based classification described by Sun et al. [14], we included only patients with an available Scheimpflug (Pentacam, Oculus, Wetzlar, Germany) examination performed on the same day as the preoperative OCT. Scheimpflug maps were classified by one corneal surgeon (P.Z.) with more than 5 years of experience in DMEK surgery, who applies this classification in his routine practice. The reader was blinded to the diagnosis and to the DCCT and EF values. The features of this classification are (1) loss of parallel isopachs on the pachymetry map, (2) displacement of the thinnest point of the cornea, and (3) focal posterior corneal surface depression. To compare the results with our model's EF, which is bounded by 0 and 1, we added the number of observed features for each examination and divided the result by 3 to obtain a value between 0 and 1. It should be noted that no other Scheimpflug-based parameter was used because these are the ones used routinely in our department. Also, CCT was measured with OCT and not with the Pentacam because it was available both before and after surgery for all cases, which ensures better reliability of the DCCT value.
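The normalization of the Scheimpflug classification to the 0–1 range is simple arithmetic, sketched below. The feature identifiers and the function name are hypothetical labels chosen for this example.

```python
# Hypothetical identifiers for the three tomographic features of the classification.
FEATURES = ("isopach_loss", "thinnest_point_displacement", "posterior_depression")


def scheimpflug_score(observed):
    """Number of observed features divided by 3, giving 0, 1/3, 2/3, or 1."""
    observed = set(observed)
    unknown = observed - set(FEATURES)
    if unknown:
        raise ValueError(f"unknown feature(s): {sorted(unknown)}")
    return len(observed) / len(FEATURES)


score_none = scheimpflug_score([])
score_two = scheimpflug_score(["isopach_loss", "posterior_depression"])
score_all = scheimpflug_score(FEATURES)
```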
Age, preoperative and postoperative CCT, DCCT, preoperative BCVA, and BCVA at 3 and 6 months are presented as mean ± standard deviation. The preoperative EF was calculated for every patient. Comparisons were performed between normal and edematous patients and between the different groups of edematous patients. Comparisons between preoperative and postoperative variables were also performed for each group. Finally, comparisons between patients with or without clinical edema were performed for each variable. A Student t-test was used for comparison when the data followed a Gaussian distribution according to the D'Agostino–Pearson test, a Mann–Whitney U test was used otherwise, and analysis of variance was used to compare the different clinical groups. Tukey's adjustment was applied to account for multiple comparisons. Proportions were compared using the χ² test.
ROC curves were calculated for three different settings: all patients, FECD and normal patients only, and non-FECD preoperative patients (others) and normal patients only. The AUC, optimal threshold, sensitivity, and specificity are reported in each case. The Pearson correlation coefficient (r) was used to assess the correlation between EF and the results of the Scheimpflug classification, between EF and preoperative BCVA, and between preoperative pachymetry and BCVA.
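The text does not state how the optimal threshold was chosen on each ROC curve. As one common possibility, the sketch below picks the cut-off maximizing Youden's J (sensitivity + specificity − 1); this criterion is an assumption of the example, not a documented detail of the study, and the scores and labels are invented toy values.

```python
def best_threshold(scores, labels):
    """Return (threshold, sensitivity, specificity) maximizing Youden's J.

    A case is called positive when its score is >= threshold.
    Ties in J are resolved in favor of the lowest threshold.
    """
    n_pos = sum(1 for y in labels if y)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes are needed")
    best = None
    for t in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y and s >= t)
        tn = sum(1 for s, y in zip(scores, labels) if not y and s < t)
        sens, spec = tp / n_pos, tn / n_neg
        j = sens + spec - 1
        if best is None or j > best[0]:
            best = (j, t, sens, spec)
    return best[1:]


# Toy data (hypothetical): EF values and whether DCCT exceeds the chosen threshold.
ef = [0.02, 0.05, 0.10, 0.30, 0.12, 0.40, 0.60, 0.80]
has_edema = [False, False, False, False, True, True, True, True]

threshold, sensitivity, specificity = best_threshold(ef, has_edema)
```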
P values of less than 0.05 were considered statistically significant. Statistical analyses were performed with the online application EasyMedStat (version 3.4; www.easymedstat.com) and Stata software. Plots and heat map representations were made using the Seaborn and Matplotlib libraries in Python 3.6.
We showed that our deep learning model performed optimally in the detection of preoperative corneal edema corresponding to at least 20 µm of corneal thickness. Few other studies have described artificial intelligence models designed to improve decision-making in corneal keratoplasty surgery. One study [23] described an unsupervised pipeline to cluster observations in groups of likelihood of future keratoplasty using OCT-based parameters. Although the results are interesting, it should be noted that the unsupervised algorithm used does not allow the a posteriori analysis of a new examination, which limits its usability in clinical practice. Eleiwa et al. [28] reported a deep learning algorithm capable of detecting clinically visible edema as well as FECD as accurately as ophthalmologists. This work describes an interesting automation of the clinical assessment of FECD and corneal edema but does not provide additional help to the clinician in the decision-making process. Regarding DMEK surgery specifically, we believe that detection of corneal edema and subclinical edema could be an interesting addition to the existing tools for preoperative assessment.
Currently, there is no gold standard to detect corneal edema objectively. Although the clinical evaluation of corneal edema remains inaccurate, corneal thickness remains the most widely used objective parameter to assess corneal edema. Corneal thickness can be deceiving when detecting edema in naturally thin or thick corneas [10,11]. Several studies have addressed the evaluation and detection of corneal edema [12–14,28–30]. We aimed to develop an automated objective tool, rather than a subjective classification. Indeed, several studies focused on diagnosing and grading FECD severity [12,14]. Krachmer et al. [34] and Louttit et al. [35] described a method to grade FECD using the distribution of guttae and presence of edema. Their grading scales include the existence of edema as a parameter of increased FECD severity, but the Krachmer scale states that corneal edema can only be present with extensive guttae.
In our study, we have deliberately chosen to use the DCCT to define edema because it is an objective and quantitative parameter. The clinical assessment of minimal or subclinical corneal edema is subjective and unreliable. Minimal edema is nonetheless certainly an interesting matter when considering a patient for surgery, especially for a triple procedure. Therefore, we wanted an objective metric to evaluate our model's detection performance and to determine the minimal detectable edema threshold with our method.
We compared our model's performance with the Scheimpflug classification described by Sun et al. [14]. We observed a perfect agreement between both methods in cases with a DCCT of 26 µm or less and a relatively good agreement in cases with a DCCT over that threshold. Some patients with a DCCT of more than 26 µm exhibited only one feature, whereas the authors suggested that the presence of two features is indicative of subclinical edema. Even though this classification was described for features inside the central 4 mm, we believe it can be used equally in cases falling outside these criteria. In our study, most false-negative cases had visible features outside of the central 4 mm. Interestingly, in their study, the only case reported with guttae and without any tomographic feature had a DCCT of 27 µm, whereas other cases with tomographic features had a DCCT greater than this value. This value is comparable with the 26-µm cut-off we found for significant edema detectable by both techniques.
Recently, Zander et al. [36] developed a model to predict corneal edema resolution after DMEK based on a single Scheimpflug tomographic imaging examination in patients with FECD. They assessed tomographic features and parameters of corneal shape and structure before and after intervention restoring endothelial function. The model was validated on 32 eyes. The ROC curve AUC was 0.97 (95% confidence interval, 0.86–1.00) for separating patients with an edema resolution of less than 50 µm from those with more edema.
More recently, Patel et al. [37] also developed a model to predict corneal improvement after DMEK for FECD by using specific software providing quantitative parameters from Scheimpflug tomography posterior elevation and pachymetry maps that were independent of corneal thickness. The model was evaluated on 45 eyes, and the R² between predicted and observed CCT change was 0.89. It should be noted that, in both studies, no normal cases were included; therefore, the risk of a false positive was not evaluated completely.
Our study has several limitations. First, the DCCT threshold is affected by the Descemet membrane thickness (DMT), which is increased in patients with FECD. Indeed, Huang et al. [38] reported that a higher density of guttae is correlated with increased thickness, with a mean DMT of 25.5 ± 10.9 µm in corneas with guttata against 16.1 ± 2.4 µm in normal corneas. Thus, it would be relevant to subtract the DMT from the DCCT to obtain an accurate measurement of the differential thickness strictly due to edema.
Moreover, because the DMT is higher in the center than in the periphery in FECD [39], it is interesting to note that it could also affect the performance of previously described techniques. Indeed, posterior corneal densitometry, central-to-peripheral thickness ratio, and focal posterior corneal surface depression could all theoretically be falsely positive owing to a local central thickening of the Descemet membrane, even without any stromal edema.
By using the absolute DCCT instead of the relative thickness variation (defined as DCCT divided by preoperative pachymetry), it is easy to identify patients whose DCCT is comparable with the thickness of a thickened FECD Descemet membrane. Indeed, patients with FECD with a DCCT of less than 20 µm and negative for edema could be true negatives, and this CCT decrease could be explained by the difference in DMT alone. Thus, the cut-off of 20 µm is not absolute and should be adjusted for DMT to be certain of the presence of edema.
In addition to the EF metric, our model provides informative images of the regions detected as edema on each OCT image (Figs. 4 to 7). As with most deep learning models, the explainability of the results is limited. Even though we cannot know exactly why the model selects those regions as edema, we believe that validation against an objective metric (DCCT) and other imaging techniques (Scheimpflug) helps build trust in the model's results. Moreover, the highlighted regions of the color map encourage the clinician to look closer for subtle signs of edema.
The provided color maps should be interpreted together with the EF results. Indeed, in cases of EF lower than the significant cut-off of 0.143, the colored heat map can highlight features of the stroma or the epithelium, helping the surgeon in their decision. For example, in Figure 7, the CCT was 542 µm, the DCCT was 44 µm, and the mean EF was 0.062, below the optimal threshold. Nonetheless, the model highlights a region of peripheral edema, also visible on the Scheimpflug examination. In this case of very localized edema, the angular spacing of 22.5° of the Avanti radial scans and the process of averaging the EF value over all scans resulted in an EF value lower than the optimal detection threshold.
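The dilution effect of averaging can be made concrete with a short numerical sketch. The per-scan EF values below are hypothetical and are not the values of the case shown in Figure 7; only the 0.143 cut-off and the eight radial scans per examination come from the text.

```python
EF_THRESHOLD = 0.143  # optimal EF cut-off reported in the study

# Hypothetical per-scan EF values for the 8 radial scans of one examination:
# edema is confined to the region crossed by a single scan.
per_scan_ef = [0.40, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01]

mean_ef = sum(per_scan_ef) / len(per_scan_ef)
scans_above = [i for i, ef in enumerate(per_scan_ef) if ef >= EF_THRESHOLD]
below_cutoff = mean_ef < EF_THRESHOLD
```

Here one scan lies well above the cut-off, yet the mean over the eight scans is about 0.059 and falls below it, which is why the color maps remain informative when the averaged EF is negative.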
Some limitations are specific to the current version of the model. Some of the control patients with no edema had high EF values (>0.8), meaning that most of the cornea was detected as edema. Such cases are probably indicative of a global difference in the image signal, which could affect the whole image, independent of the presence of edema. Convolutional neural networks are very sensitive to subtle signal differences, often imperceptible to the human eye. It would be interesting to repeat the OCT scans in these cases and compare the results.
Regarding the comparison with the Scheimpflug classification, despite encouraging results, very few patients had undergone a Scheimpflug tomography before surgery. A further comparative study including more patients would be interesting.
One limitation is related to how CCT is measured with the Avanti OCT. Indeed, there is no image registration to ensure corneal thickness measurement repeatability. Thus, measures performed before and after surgery for the same patient are probably not from the exact same area. But because the CCT value is averaged from the 3-mm central zone, it is probably an acceptable approximation to consider them as the same regions.
Validation of the model should be further conducted prospectively and in other populations and situations. For example, the repeatability and diurnal variation of our model's results [40], as well as postoperative performance, should be addressed in subsequent studies.
Figure 8 provides a suggested decision tree for the clinical use of our model in patients with FECD, which could be one of its most common use cases.
When assessing patients with FECD, the first and foremost aspect to consider is the presence of symptoms. Indeed, asymptomatic patients should not undergo any surgery. Nonetheless, using our model could provide additional objective baseline information for comparison during their follow-up.
In cases of symptomatic patients without cataract, if the model detects edema, a DMEK surgery could be suggested as endothelial failure is manifest and will only worsen with time. In contrast, in cases of symptomatic patients without cataract and no detected edema, visual discomfort is probably due to the optical consequences of the guttae alone. In this case, depending on the magnitude of the patient's symptoms, a simple follow-up, Descemet stripping only, or DMEK could be discussed. Finally, in cases of symptomatic patients with cataract, the presence or absence of detected edema could help decide between a triple procedure and cataract surgery alone. Indeed, in our cohort, some patients had no detected edema, neither with our model nor with the Scheimpflug classification. It could be argued that cataract surgery alone would have been a good option for those patients.
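The branches described in the text above can be written down as a small function. This sketch paraphrases the prose only; it is not a transcription of Figure 8 and is not clinical guidance. The function name, argument names, and option strings are hypothetical, and `edema_detected` is assumed to come from the model output (for instance, an EF at or above the reported cut-off).

```python
def suggest_options(symptomatic, cataract, edema_detected):
    """Return the options the text lists for discussion in each situation."""
    if not symptomatic:
        # Asymptomatic patients: no surgery; model output kept as an objective baseline.
        return ["follow-up with model output as baseline"]
    if not cataract:
        if edema_detected:
            return ["DMEK"]
        # Symptoms attributed to the optical effect of guttae alone.
        return ["follow-up", "Descemet stripping only", "DMEK"]
    if edema_detected:
        return ["triple procedure (DMEK and phacoemulsification)"]
    return ["cataract surgery alone"]


options = suggest_options(symptomatic=True, cataract=True, edema_detected=False)
```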
Finally, it should be noted that, as with any measuring technique or device, the interpretation of its output has some elements of subjectivity. Our model is intended to help the clinician in reading the images and should be used jointly with clinical examination, corneal thickness measurement, and topography to help the surgeon in decision-making.
Because DMEK surgery has become widely accepted, accurate detection of edema is essential, as it reveals endothelial dysfunction. We believe methods of edema detection should be evaluated against an objective measurement rather than a subjective clinical classification. Indeed, a point-by-point differential pachymetry between preoperative and postoperative measurements, with registration and subtraction of the Descemet membrane thickness, could provide a robust standard to quantify preoperative edema and allow for a precise evaluation of new screening techniques.
In conclusion, we have developed an automated tool capable of objectively detecting minimal corneal edema in patients before DMEK surgery. Our deep learning approach seems promising and could improve the detection of subclinical edema, alone or combined with other existing methods. In addition, it can probably also be used in the follow-up of DMEK surgeries to assess graft function and corneal edema reduction. This should be verified in subsequent studies. In the future, we can imagine its systematic use by corneal experts before DMEK surgery and by other ophthalmologists in decision-making before cataract surgery.
The authors thank Sarah Moran for her contribution to the review of the prepublication report.
Disclosure: K. Bitton, None; P. Zéboulon, None; W. Ghazal, None; M. Rizk, None; S. Elahi, None; D. Gatinel, None