Translational Vision Science & Technology Cover Image for Volume 14, Issue 2
February 2025
Volume 14, Issue 2
Open Access
Artificial Intelligence  |   February 2025
Deep Learning Approaches to Predict Geographic Atrophy Progression Using Three-Dimensional OCT Imaging
Author Affiliations & Notes
  • Kenta Yoshida
    Clinical Pharmacology, Genentech, Inc., South San Francisco, CA, USA
  • Neha Anegondi
    Clinical Imaging Group, Genentech, Inc., South San Francisco, CA, USA
  • Adam Pely
    gRED Computational Science, Genentech, Inc., South San Francisco, CA, USA
  • Miao Zhang
    gRED Computational Science, Genentech, Inc., South San Francisco, CA, USA
  • Frederic Debraine
    Product Development Ophthalmology, Genentech, Inc., South San Francisco, CA, USA
  • Karthik Ramesh
    Product Development Ophthalmology, Genentech, Inc., South San Francisco, CA, USA
  • Verena Steffen
    Product Development Data Science, Genentech, Inc., South San Francisco, CA, USA
  • Simon S. Gao
    Clinical Imaging Group, Genentech, Inc., South San Francisco, CA, USA
  • Catherine Cukras
    Department of Ophthalmology, Roche Pharma Research and Early Development, F. Hoffmann-La Roche Ltd, Basel, Switzerland
  • Christina Rabe
    Product Development Data Science, Genentech, Inc., South San Francisco, CA, USA
  • Daniela Ferrara
    Product Development Ophthalmology, Genentech, Inc., South San Francisco, CA, USA
  • Richard F. Spaide
    Vitreous Retina Macula Consultants of New York, New York, NY, USA
  • SriniVas R. Sadda
Doheny Eye Institute, Los Angeles, CA, USA; Department of Ophthalmology, David Geffen School of Medicine at University of California, Los Angeles, Los Angeles, CA, USA
  • Frank G. Holz
    Department of Ophthalmology and GRADE Reading Center, University of Bonn, Bonn, Germany
  • Qi Yang
    Product Development Ophthalmology, Genentech, Inc., South San Francisco, CA, USA
  • Correspondence: Qi Yang, Product Development, Analytics and Medical Imaging, Genentech, A Member of the Roche Group, 1 DNA Way, South San Francisco, CA 94080, USA. e-mail: [email protected] 
Translational Vision Science & Technology February 2025, Vol.14, 11. doi:https://doi.org/10.1167/tvst.14.2.11
Abstract

Purpose: To evaluate the performance of various approaches of processing three-dimensional (3D) optical coherence tomography (OCT) images for deep learning models in predicting area and future growth rate of geographic atrophy (GA) lesions caused by age-related macular degeneration (AMD).

Methods: The study used OCT volumes of GA patients/eyes from the lampalizumab clinical trials (NCT02247479, NCT02247531, NCT02479386); 1219 and 442 study eyes for model development and holdout performance evaluation, respectively. Four approaches were evaluated: (1) en-face intensity maps; (2) SLIVER-net; (3) a 3D convolutional neural network (CNN); and (4) en-face layer thickness and between-layer intensity maps from a segmentation model. The processed OCT images and maps served as input for CNN models to predict baseline GA lesion area size and annualized growth rate.

Results: For the holdout dataset, the Pearson correlation coefficient squared (r2) in the GA growth rate prediction was comparable for all the evaluated approaches (0.33∼0.35). In baseline lesion size prediction, prediction performance was comparable (0.9∼0.91) except for the SLIVER-net (0.83). Prediction performance with only the thickness map of the ellipsoid zone (EZ) or retinal pigment epithelium (RPE) layer individually was inferior to using both. Addition of other layer thickness or intensity maps did not improve the prediction performance.

Conclusions: All explored approaches had comparable performance, suggesting that prediction of GA growth rate from OCT may have reached a plateau. The EZ and RPE layers appear to contain the majority of the information related to the prediction.

Translational Relevance: Our study provides important insights into the utility of 3D OCT images for predicting GA disease progression.

Introduction
Age-related macular degeneration (AMD) is a leading cause of irreversible blindness among the elderly population across the globe.1 A significant proportion of advanced AMD cases present as geographic atrophy (GA), which leads to progressive and severe loss of central vision; identifying novel treatment strategies for GA is an active area of research. The development of efficacious therapies relies heavily on our understanding of the natural disease progression, particularly in guiding the design, execution, and interpretation of clinical trials. 
Ocular imaging has transformed the clinical management and scientific understanding of retinal diseases, including GA.2 Optical coherence tomography (OCT) has the unique capability of capturing high-resolution cross-sectional images of the retina, allowing for the accurate quantification and evaluation of various retinal layers. This noninvasive imaging technology serves as a surrogate tool for in vivo “histology,” providing valuable insights into the pathogenesis and progression of retinal diseases.3 Recent advancements in machine learning, particularly deep learning (DL), have facilitated effective use of the data derived from ocular images.4,5 
OCT holds particular promise for GA progression prediction owing to its capacity to visualize specific retinal structures affected by the disease, such as drusen, Bruch membrane (BM) changes, hyperreflective foci and retinal pigment epithelium (RPE) alterations, which cannot be fully captured by other imaging modalities like color fundus photography or fundus autofluorescence (FAF). Additionally, OCT's ability to elucidate the microstructural changes in the retina that precede GA diagnosis suggests that it can potentially serve as a predictive image biomarker for early detection and intervention. This underscores the importance of OCT images in creating more effective prognostic models for disease progression. 
Prognostic models, including those utilizing convolutional neural network (CNN) models, aim to predict future outcomes.6,7 Although prognostic models have multiple applications in clinical trials, one notable area of application is covariate adjustment.8,9 Covariate adjustment is a statistical methodology deployed in the analysis phase of clinical trials, highly encouraged by the scientific community and other important stakeholders such as regulatory authorities. This methodology adjusts the outcome variable (e.g., future GA growth) according to the potential influence of one or more baseline (pre-treatment) prognostic variables (covariates), thus enhancing the precision of the estimated treatment effect. In a previous study, we developed DL prognostic models for GA growth using FAF and OCT images, and demonstrated that the models based on FAF-only, OCT-only, and multimodal (FAF and OCT) images reached a squared Pearson correlation coefficient (r2) of 0.48, 0.36, and 0.47, respectively, for GA growth rate prediction on the holdout dataset.10 These performances are notable, as an r2 of 0.48 corresponds to an almost 90% increase in the effective sample size when applied for covariate adjustment. In addition to the study described here, many studies have demonstrated the utility of retinal images for prognostic modeling.5,11 
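The link between a prognostic covariate's r2 and the effective sample size can be sketched as follows. Under a standard linear-adjustment approximation (an illustration, not a calculation from the study itself), adjusting for a baseline covariate that explains a fraction r2 of the outcome variance shrinks the residual variance by (1 − r2), so the effective sample size scales by 1/(1 − r2):

```python
# Sketch: precision gain from covariate adjustment with a prognostic model.
# The function name is ours, not from the study's code.

def effective_sample_size_multiplier(r_squared: float) -> float:
    """Effective-sample-size multiplier when adjusting for a baseline
    covariate with squared Pearson correlation r_squared to the outcome."""
    if not 0.0 <= r_squared < 1.0:
        raise ValueError("r_squared must lie in [0, 1)")
    return 1.0 / (1.0 - r_squared)

# r^2 = 0.48 (the FAF-based model) -> 1/(1 - 0.48) ~ 1.92x,
# i.e., an almost 90% increase in effective sample size.
print(round((effective_sample_size_multiplier(0.48) - 1.0) * 100))  # prints 92
```

By the same formula, the OCT-based r2 of 0.36 would correspond to roughly a 56% increase, which is why even modest gains in r2 are valuable for trial efficiency.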
The objectives of this study were twofold: (1) to evaluate whether different approaches to handling OCT images would result in superior prediction performance compared with the previous study, and (2) to examine which structural changes in retinal layers of the OCT images contain information related to GA disease progression. The main challenges in developing CNN models based on OCT images arise from their intrinsically three-dimensional (3D) nature. In the previous study, we converted OCT scans into three en-face images, corresponding to the average intensity for the full depth, above BM, and below BM, and treated the concatenated images as three-channel images for two-dimensional (2D) CNN models.10 In this study, we evaluated four approaches for processing 3D OCT images for deep learning models: (1) en-face intensity maps as evaluated previously; (2) SLIVER-net, a novel approach that processes the volume scans into tiled 2D images, then adds "layers" to capture the 3D structure12; (3) a 3D CNN approach in which 3D OCT images are directly used without conversion into 2D13; and (4) en-face feature maps corresponding to the thickness and intensities between different layers of the OCT scan generated by the internally developed segmentation model EyeNotate (Pely A, et al. IOVS 2024;65:ARVO E-Abstract 2344). The 3D CNN approach was selected because it uses the entire information of OCT scans without losing spatial information. However, it involves a large number of parameters even for a relatively simple architecture like DenseNet121 (which we used in this study) and may require a larger data volume to allow for sufficient training, especially considering the general unavailability of pretrained weights. We therefore also evaluated the SLIVER-net approach to maintain some level of spatial information while reducing the number of parameters. 
In a similar manner, we evaluated the modeling based on the retinal layer segmentation outputs from EyeNotate, with the assumption that these segmented layers contain key parts of the information relevant for GA progression that increase the efficiency of prediction models. We hypothesized that comparing these various approaches could lead us to a more effective methodology for the accurate prediction of GA progression, and help elucidate the aspects of the 3D volumetric data that inform GA progression prediction. 
Methods
Data Source
Data from three prospective clinical trials that enrolled patients with bilateral GA were pooled and used for model development: two phase 3 lampalizumab studies (Chroma [NCT02247479] and Spectri [NCT02247531])14 in which no treatment effect was observed, and an observational study (Proxima A [NCT02479386]).15 The details of the study eligibility criteria have been previously described and, briefly, included the diagnosis of bilateral GA with no choroidal neovascularization secondary to AMD in either eye.14,15 
For the current study, patients with baseline macular OCT volumes captured using the Spectralis HRA+OCT (Heidelberg Engineering, Heidelberg, Germany) with a 20° × 20° scan, 1024 A-scans per B-scan, 496 pixels per A-scan and 49 B-scans per volume for the study eye were included. Additionally, only the patients/eyes with successful segmentation of OCT volumes as described below were included. Details of the derivations of GA lesion areas, as well as GA growth rate, have been described previously.10 Briefly, GA lesion areas were manually graded using a semi-automated annotation tool (RegionFinder software; Heidelberg Engineering) on FAF images by experts from a reading center. The GA growth rate (mm2/year) was defined as the annualized slope of a linear model fitted using all available FAF measurements for each eye.10 The physical dimensions of the FAF and OCT images were approximately 8.7 mm × 8.7 mm and 6 mm × 6 mm, respectively. In total, 1219 and 442 patients/eyes were included as development and holdout dataset, respectively, after the random splitting of the data at the patient level. For a sensitivity analysis, 127 patients who had GA lesions extending beyond the OCT field of view were manually excluded. 
Model Architectures
Four distinct approaches of OCT-based CNN models were evaluated in this study (Fig. 1). 
  • 1. The first approach is the same as reported previously,10 evaluated here for the purpose of comparison. Briefly, en-face maps of the intensity for full-depth, above BM, and below BM were combined as a three-channel input to the 2D CNN. InceptionV316 was used as the backbone architecture.
  • 2. As the second approach, we applied the recently proposed SLIVER-net approach,12 wherein the 3D OCT volume scans were converted into a series of 2D tiled images. After the tiled images were subjected to regular 2D CNN, the output vector was further aggregated by 1D CNN that respects the spatial relationships between tiled images before being subjected to the prediction tasks. Resnet1817 was used as the backbone architecture.
  • 3. As the third approach, we used the 3D CNN approach.13 Unlike the tiling method used above, the CNN here was fed the original 3D OCT images without prior conversion to 2D, thus directly processing the 3D data. 3D DenseNet121 was used as the backbone architecture.
  • 4. In the fourth approach, en-face feature maps of various retinal layers obtained from OCT segmentation algorithms EyeNotate were used as input to 2D CNN models (Pely A, et al. IOVS 2024;65:ARVO E-Abstract 2344). Briefly, EyeNotate with DeepLabv3+ architecture was trained on 6718 annotated OCT B-scans from participants involved in intermediate AMD and GA trials (NCT01790802 and NCT02399072; n = 189), resulting in the segmentation of retinal layers in B-scan image including ELM, EZ, and RPE layers. Using the segmentation output, thickness and mean intensity of each layer in en-face view were generated as input for the CNN model. For the CNN model, InceptionV316 was used as the backbone architecture.
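As a toy illustration of how an en-face feature map can be derived from a per-voxel layer segmentation (the fourth approach above), consider the sketch below. EyeNotate's actual output format and the axial scaling are not specified in this paper, so the binary-mask representation and the nominal 3.87-µm axial pixel size are assumptions for illustration only:

```python
import numpy as np

# Hypothetical sketch: derive an en-face layer thickness map from a binary
# segmentation mask volume. mask has shape (depth, width, n_bscans), with
# True marking voxels inside the layer of interest (e.g., EZ or RPE).

def enface_thickness_map(mask: np.ndarray, axial_um_per_px: float = 3.87) -> np.ndarray:
    """Count mask voxels along the depth axis and convert to micrometers,
    yielding a (width, n_bscans) en-face thickness map."""
    return mask.sum(axis=0) * axial_um_per_px

# Toy volume matching the study's raw dimensions: a layer 10 voxels thick.
mask = np.zeros((496, 1024, 49), dtype=bool)
mask[200:210, :, :] = True
tmap = enface_thickness_map(mask)
print(tmap.shape)  # (1024, 49): one thickness value per A-scan position
```

A thinned or absent layer (e.g., EZ loss around a GA lesion) would show up as low or zero values in this map, which is the signal the downstream 2D CNN consumes.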
Figure 1.
 
Overview of the preprocessing steps of OCT images and CNN model architectures for the four approaches evaluated in the study.
For all of the modeling approaches, pretrained weights from ImageNet18 were used as the starting point of training for the backbone architectures. During model optimization, both GA lesion size at baseline and future GA growth rate were simultaneously included in the loss function. The feature vectors from the respective CNNs were followed by dropout and dense layers (256 hidden units), then final fully connected layers for the prediction. The weights of all layers (both backbone and additional layers) were trained. The numbers of trainable parameters were 22,817,954, 11,317,285, 11,811,842, and 22,817,666 for approaches 1, 2, 3, and 4 (with EZ and RPE layer thickness maps as input), respectively. 
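The joint two-task objective can be sketched as a weighted sum of the per-task losses, with the relative weight on the area term treated as a tuned hyperparameter (see Model Optimization and Evaluation). The function below is illustrative, not the study's code:

```python
import numpy as np

# Sketch of the multi-task training objective: baseline GA lesion area and
# annualized growth rate are predicted jointly. area_weight is the relative
# weight of the area loss against the growth-rate loss (names are ours).

def multitask_loss(pred_area, true_area, pred_growth, true_growth,
                   area_weight=0.5, loss="mae"):
    """Weighted sum of per-task losses: area_weight * L_area + L_growth."""
    err_a = np.asarray(pred_area, float) - np.asarray(true_area, float)
    err_g = np.asarray(pred_growth, float) - np.asarray(true_growth, float)
    if loss == "mae":
        l_a, l_g = np.abs(err_a).mean(), np.abs(err_g).mean()
    else:  # "mse"
        l_a, l_g = (err_a ** 2).mean(), (err_g ** 2).mean()
    return area_weight * l_a + l_g

print(multitask_loss([2.0], [1.0], [0.5], [0.0], area_weight=0.5))  # 1.0
```

Because the two targets live on different scales (mm2 vs. mm2/year), the relative weight effectively trades off how much gradient signal each task contributes during training.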
Image Pre-Processing
The original 3D OCT volumes (496 × 1024 × 49 voxels) were subjected to preprocessing before being fed into the above CNN models (Fig. 1). For approaches 1 to 3, each B-scan was first flattened along the BM (each A-scan was aligned based on the location of the BM), resulting in 512 × 512 × 49 voxels, as described previously.10 For the en-face intensity approach (approach 1), en-face maps of the intensity for the full depth and for 100 pixels above and below the BM (approximately 390 µm) were combined into three-channel images. For SLIVER-net and the 3D CNN, each B-scan was further cropped to 256 × 512 around the BM, then resized to 224 × 224 before being tiled into a 224 × 10976 2D image (SLIVER-net) or resized to 224 × 224 × 49 voxels (3D CNN). For the fourth approach, the layer thickness and/or between-layer intensity maps for the layers of interest, obtained from EyeNotate (Pely A, et al. IOVS 2024;65:ARVO E-Abstract 2344), were combined and fed as multichannel images of size 512 × 512 to the 2D CNN. 
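The flattening and en-face averaging steps for approach 1 can be sketched as follows. This is a simplified illustration: the BM row positions are synthetic (in the study they come from segmentation), and np.roll wraps around where the real pipeline would crop or pad:

```python
import numpy as np

# Sketch of approach-1 preprocessing: flatten each B-scan along the Bruch
# membrane (BM), then average over depth to build three en-face maps
# (full depth, 100 px above BM, 100 px below BM).

def flatten_bscan(bscan: np.ndarray, bm_row: np.ndarray, target_row: int) -> np.ndarray:
    """Shift every A-scan (column) so its BM lands on target_row."""
    flat = np.zeros_like(bscan)
    for x in range(bscan.shape[1]):
        shift = target_row - int(bm_row[x])
        flat[:, x] = np.roll(bscan[:, x], shift)  # real code crops/pads instead
    return flat

def enface_maps(volume: np.ndarray, target_row: int, band: int = 100) -> np.ndarray:
    """volume: (depth, width, n_bscans), already BM-flattened.
    Returns (width, n_bscans, 3): full-depth, above-BM, and below-BM means."""
    full = volume.mean(axis=0)
    above = volume[target_row - band:target_row].mean(axis=0)
    below = volume[target_row:target_row + band].mean(axis=0)
    return np.stack([full, above, below], axis=-1)

vol = np.random.default_rng(0).random((512, 512, 49))
maps = enface_maps(vol, target_row=256)
print(maps.shape)  # (512, 49, 3): width x B-scans x 3 channels
```

Note the raw en-face map is only 49 pixels in the B-scan dimension; upsampling to the square 512 × 512 input happens downstream.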
Online image augmentations were performed. For approaches 1 and 4, the following augmentations were applied: horizontal flip, rotation (0.01 × 2π), contrast (±0.2), and brightness (±0.2). For approach 2 (SLIVER-net), horizontal flip, contrast (±0.2), brightness (±0.2), and Gaussian noise (σ = 0.05) were applied. For approach 3 (3D CNN), horizontal flip along the B-scan dimension, brightness and contrast (±0.2), and Gaussian noise (σ2 = 0.001) were applied. 
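A minimal sketch of such an online augmentation step (illustrative only; the study's actual pipeline, ranges, and ordering are not reproduced here) might look like:

```python
import numpy as np

# Illustrative online augmentation for a normalized 2D image in [0, 1]:
# random horizontal flip, contrast/brightness jitter, and Gaussian noise.

def augment(img: np.ndarray, rng: np.random.Generator,
            flip_p=0.5, brightness=0.2, contrast=0.2, noise_sigma=0.05) -> np.ndarray:
    out = img.astype(np.float32)
    if rng.random() < flip_p:
        out = out[:, ::-1]                                   # horizontal flip
    out = out * (1.0 + rng.uniform(-contrast, contrast))     # contrast jitter
    out = out + rng.uniform(-brightness, brightness)         # brightness shift
    out = out + rng.normal(0.0, noise_sigma, out.shape)      # Gaussian noise
    return np.clip(out, 0.0, 1.0)

rng = np.random.default_rng(42)
aug = augment(np.full((224, 224), 0.5, dtype=np.float32), rng)
print(aug.shape)  # (224, 224)
```

Applying these transforms "online" (freshly sampled each epoch) means the model effectively never sees the exact same input twice, which helps regularize training on a dataset of this size.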
Model Optimization and Evaluation
The model optimization and performance evaluation were performed using a simplified nested fivefold cross-validation with hyperparameter optimization. First, five outer folds were generated by randomly splitting the development dataset. Each outer training set was then further randomly split at a ratio of 3:1 (training/validation). The resulting inner training and inner validation datasets were used for hyperparameter optimization, selecting the hyperparameters that maximized the inner validation performance. The model was then retrained using the optimized hyperparameters on the full outer training dataset (inner training + inner validation datasets) corresponding to each outer fold, and the outer validation performance was calculated for each outer fold. Finally, the outer validation performances for all five outer folds were summarized. 
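The index bookkeeping for this simplified nested scheme can be sketched as follows (patient-level indices only, no model code; the function name and dict layout are ours):

```python
import numpy as np

# Sketch of the simplified nested five-fold scheme: five outer folds; each
# outer training set is split 3:1 into inner train/validation sets for
# hyperparameter selection, then the model is refit on the full outer
# training set before scoring on the outer validation fold.

def nested_folds(n: int, k: int = 5, inner_ratio: float = 0.75, seed: int = 0):
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n)
    outer = np.array_split(idx, k)
    folds = []
    for i in range(k):
        outer_val = outer[i]
        outer_train = np.concatenate([outer[j] for j in range(k) if j != i])
        cut = int(len(outer_train) * inner_ratio)  # 3:1 inner split
        folds.append({"inner_train": outer_train[:cut],
                      "inner_val": outer_train[cut:],
                      "outer_train": outer_train,
                      "outer_val": outer_val})
    return folds

folds = nested_folds(1219)  # size of the development dataset
print(len(folds))  # 5
```

Because the split is done once per outer fold rather than as a full inner cross-validation, this is cheaper than a textbook nested CV, at the cost of noisier hyperparameter selection.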
The following hyperparameters were optimized with 20 iterations of the tree-structured Parzen estimator algorithm: batch size (4, 8, 16), dropout rate on the feature vector before the last dense layer (0∼0.9), learning rate (1E-5∼1E-3), loss function (mean absolute error or mean squared error), and relative weight of the loss for the baseline GA lesion size against the GA growth rate (0.1∼1). For SLIVER-net and the 3D CNN, batch sizes of only four and eight were evaluated because of GPU memory constraints. The Pearson correlation coefficient (r) of the GA growth rate for the inner validation dataset was used as the objective function for hyperparameter optimization. 
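The search space above can be encoded as follows. As a plain random-search stand-in for the tree-structured Parzen estimator (which the study ran via Optuna), this sketch only shows how the five hyperparameters would be sampled:

```python
import math
import random

# Search space from the paper; sampler is a simple random-search stand-in
# for the TPE algorithm (names and structure are ours).
SPACE = {
    "batch_size": [4, 8, 16],
    "dropout": (0.0, 0.9),
    "learning_rate": (1e-5, 1e-3),   # sampled on a log scale
    "loss": ["mae", "mse"],
    "area_loss_weight": (0.1, 1.0),  # area loss relative to growth-rate loss
}

def sample_config(rng: random.Random) -> dict:
    lo, hi = SPACE["learning_rate"]
    return {
        "batch_size": rng.choice(SPACE["batch_size"]),
        "dropout": rng.uniform(*SPACE["dropout"]),
        "learning_rate": math.exp(rng.uniform(math.log(lo), math.log(hi))),
        "loss": rng.choice(SPACE["loss"]),
        "area_loss_weight": rng.uniform(*SPACE["area_loss_weight"]),
    }

rng = random.Random(0)
trials = [sample_config(rng) for _ in range(20)]  # 20 iterations, as in the study
print(len(trials))  # 20
```

Unlike this random sampler, TPE models the objective over past trials and preferentially samples promising regions, which matters when each trial is an expensive CNN training run.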
For the holdout prediction performance evaluation, the model was retrained with each of the five sets of optimized hyperparameters using the entire development dataset. Five predictions were made for the holdout dataset with these retrained models, and the mean of the five predictions was calculated as an ensemble prediction, which was then used for calculation of the performance metrics. 
For approach 4, only the EZ and RPE layer thicknesses were used as input when performing the above-described hyperparameter optimization and holdout performance evaluation. In addition, an exploratory analysis was performed using various combinations of OCT segmentation outputs. For this purpose, only a cross-validation evaluation with fixed hyperparameters (batch size of 16, dropout rate of 0.1, learning rate of 1E-4, mean absolute error as the loss function, relative weight of 1) was performed. 
Software and Computation Environments
All models were implemented with Tensorflow 2.9.3. The models and pre-trained weights (ImageNet) were imported with classification_models and classification_models_3D libraries. Hyperparameter optimization was performed using Optuna 3.4.0. Programs were run in Python 3.8.16 using NVIDIA T4 Tensor Core GPU on a cloud-based computing cluster. 
Results
The prediction performance was evaluated with nested cross-validation and on the holdout dataset as described in the "Model Optimization and Evaluation" section of the Methods. The performance of the four approaches of interest is summarized in the Table, Figure 2, and Supplementary Table S1. For baseline (concurrent) GA lesion area prediction, all approaches except SLIVER-net demonstrated comparably good performance in both the cross-validation and holdout datasets; holdout performance (r2) was 0.91, 0.83, 0.90, and 0.90 for the en-face intensity map, SLIVER-net, 3D DenseNet, and OCT EZ and RPE thickness map approaches, respectively. For the GA growth rate prediction, the 3D CNN approach showed lower cross-validation performance compared with the other three approaches. Nevertheless, the holdout performance was similar across all four evaluated approaches; holdout r2 was 0.33, 0.33, 0.35, and 0.35 for the four approaches, respectively. 
Table.
 
Prediction Performance of the Four Approaches on the Development Dataset (Cross-Validation) and Holdout Dataset as Evaluated by Pearson's Correlation Coefficient Squared
Figure 2.
 
Prediction performance of the four approaches on (a) the development dataset (cross-validation) and (b) holdout dataset. Cross-validation performance is given as the mean ± SD (black circles and error bars), as well as individual cross-validation fold performance (colored circles), for the square of Pearson correlation coefficient (r2). Holdout performance is given as r2 (± 95% CI, black circles and error bars). Thickness maps for the segmentation outputs were derived as distance between iEZ-iRPE (EZ) and iRPE-oRPE (RPE), where “i” refers to the inner side of and “o” refers to the outer side of the corresponding layers.
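The squared Pearson correlation coefficient (r2) used throughout as the performance metric can be computed as follows (a generic illustration, not the study's evaluation code):

```python
import numpy as np

# r^2: the squared Pearson correlation between predicted and observed values
# (e.g., predicted vs. FAF-derived GA growth rate in mm^2/year).

def r_squared(pred, obs) -> float:
    pred = np.asarray(pred, dtype=float)
    obs = np.asarray(obs, dtype=float)
    r = np.corrcoef(pred, obs)[0, 1]
    return float(r ** 2)

print(round(r_squared([1.0, 2.0, 3.0, 4.0], [1.1, 1.9, 3.2, 3.8]), 3))  # 0.982
```

Note that r2 measures linear association only; a model could achieve high r2 while being systematically biased, which is why calibration would be checked separately before use in covariate adjustment.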
To better understand which retinal layers contain information related to GA lesion area and its future growth, various combinations of segmented retinal layer thickness and intensity maps were evaluated for their predictive performance (Fig. 3). The RPE layer alone exhibited almost the same performance as the combination of the EZ and RPE layers in predicting the baseline GA area, whereas EZ showed much lower performance, suggesting that RPE layer attenuation contains the majority of the information for the current GA lesion area. When predicting GA growth, models using only the EZ or RPE layer thickness performed less effectively than those using both EZ and RPE layers combined, indicating the critical role these two layers play in such prediction. Additionally, we explored the integration of other combinations of en-face OCT segmentation outputs for layer thickness and intensity, such as external limiting membrane (ELM; iELM-iEZ in Fig. 3), drusen (sub-RPE drusen only), and choroid (total choroidal intensity), into the prediction model. However, including these additional OCT segmentation outputs did not enhance the cross-validation performance in predicting the GA growth rate. 
Figure 3.
 
Prediction performance with various combinations of OCT segmentation layer maps. The table on the left indicates which en-face feature maps were included as input to the 2D CNN. Cross-validation performance is given as the mean ± SD (black circles and error bars), as well as individual cross-validation fold performance (colored circles), for the square of Pearson correlation coefficient (r2). Thickness and intensity maps of the segmentation outputs were derived as distance or mean intensity between layers, where "i" refers to the inner side of and "o" refers to the outer side of the corresponding layers.
Discussion
The purpose of this study was to examine the performance of different approaches for processing 3D OCT images to predict the future growth rate of GA lesion areas. Our results indicated that the four OCT-based models (SLIVER-net, 3D CNN, en-face intensity map, and segmentation-based approach) performed equivalently well, achieving similar r2 values on the holdout dataset. 
One important implication of our results is that, in terms of OCT image-based predictions for GA growth rate, we may have reached a plateau using the currently available development data. Despite the implementation of novel and complex 3D OCT image processing models, we did not observe superior performance compared with the previous approach (en-face intensity map). This suggests a level of saturation in the performance that can be obtained from the available image data with the analysis approaches we used. This insight is crucial, as it indicates that the development of more sophisticated approaches for processing OCT images may not necessarily lead to significant performance improvements in predicting the future growth rate of GA lesion areas. A recent study on another alternative approach for predicting GA progression supports this hypothesis; while those authors employed a more sophisticated approach to predict the future GA lesion map directly, the numerical performance in the area prediction fell into a similar range (R2 = 0.37).19 Another hypothesis that might explain the challenges of further improving these GA progression predictions is related to the OCT image resolution in the datasets used for our study and other studies in the literature; it is still possible that the relatively low B-scan density of the OCT volumes (49 B-scans for the macular cube in our study) and the amount of 3D image training data available to capture useful information for a DL model are limiting factors in further improving the prediction performance. More recent clinical studies tend to have denser scan protocol settings, such as 97-B-scan Spectralis OCT volumes or 256-B-scan Cirrus OCT volumes, which might provide additional insights. In the current study, we used relatively mature deep learning architectures (CNNs) as backbone models. 
Evaluation of more recent approaches such as self-supervised learning or foundation models could be a potential area for future investigation.20 Nevertheless, the convergence of multiple independent methodologies to a similar level of predictive accuracy underscores the robustness of OCT images as a critical tool in estimating the progression of GA. 
The value of different segmentation outputs as predictors of GA growth was also examined. When we evaluated the influence of EZ and RPE layer thicknesses on prediction performance, models incorporating both EZ and RPE layer thicknesses showed superior performance compared with models using either EZ or RPE thickness alone. This observation is in alignment with the mechanistic understanding that RPE layer losses correspond to the current GA lesion, whereas EZ layer losses correspond to the areas where future lesion growth is anticipated.21–23 
We also explored whether the inclusion of additional retinal segmentation output information can further enhance the GA progression prediction performance. Specifically, we explored various combinations involving layer thickness and layer intensity maps of the ELM, drusen, and choroid. Outer nuclear layer was previously shown to be predictive of GA progression24; however, it was not included in this analysis because of the technical challenge of accurately quantifying the outer nuclear layer thickness in and around the GA lesion where ELM is absent. As shown in Figure 3, adding these additional layer maps to models already utilizing EZ and RPE layer thickness maps showed no noticeable improvements in cross-validation performance for predicting GA growth rate. This suggests that the EZ and RPE layers already contain the majority of the relevant information for the prediction performance observed in these OCT-based prediction models. 
It is important to note that the GA growth rate prediction performance observed in this study based on OCT images was lower than what we previously observed with FAF image-based approaches (r2 = 0.48 on the holdout dataset compared with 0.33∼0.35 observed in this study), which illustrates some of the key limitations of the approaches used in this study.10 We performed a sensitivity analysis by manually excluding the eyes with GA lesions extending beyond the OCT field of view; however, the performance of the OCT-based approaches did not improve (data not shown). One plausible explanation for the observed performances could be the inherent differences in the imaging data captured by FAF and OCT and their subsequent interpretation by the models, as discussed in our previous publication, in addition to the relatively low B-scan density discussed above.10 It is worth mentioning that the GA area at baseline and in follow-up visits over time was originally derived from manual expert gradings of FAF images. Therefore, the FAF-based approach has an inherent advantage of using the same source of imaging information for the prediction tasks. Nevertheless, we believe it is important to evaluate performance on FAF-derived metrics, as they are the gold standard endpoint in GA clinical trials. Strengths of this study include the overall high-quality datasets and expert gradings, collected in controlled clinical trials, and our comprehensive modeling approaches that enabled a systematic comparison. 
There are several limitations to this study other than those already discussed above. First, we did not evaluate prediction performance on an independent dataset, which may limit the generalizability of our findings. Second, the performance of our models may have been impacted by the downsampling of the OCT images for SLIVER-net and the 3D CNN (from 1024 to 224 pixels in B-scan width), which was necessary because of the image size and GPU memory constraints. Additionally, although we explored several 3D modeling approaches, other architectures, such as those involving Vision Transformer models, or a combination of OCT and FAF data in multimodal models, might be better suited for GA images and could lead to improved prediction performance. 
In conclusion, our study offers significant insights into the use of 3D OCT images in prognostic modeling of GA area and progression. The conceptually and technically distinct OCT-based models, including SLIVER-net, 3D CNN, and segmentation-based strategies, demonstrated comparable prediction performances, suggesting a possible performance plateau with the current data management and analysis methods. Furthermore, our evaluation of various segmentation outputs indicated that the EZ and RPE layers contain most of the critical information for GA lesion growth based on OCT images. These insights can aid in scientific progress from two different perspectives. From a clinical science perspective, this work can lead to insights that inform key elements of clinical trial design, such as patient selection and endpoint analysis. From a data science perspective, it can contribute to our understanding of data preparation and modeling approaches for algorithm development. Future research is needed in evaluating the potential utility of more dense OCT scans or new retinal imaging technologies, and advanced modeling approaches such as the use of foundation models. 
Acknowledgments
Supported by Genentech/Roche. 
The figures were created with Biorender.com. 
Disclosure: K. Yoshida, Genentech/Roche (E, I); N. Anegondi, Genentech/Roche (E, I), “Multimodal prediction of geographic atrophy growth rate” (P), “Multimodal geographic atrophy lesion segmentation” (P); A. Pely, Genentech/Roche (C); M. Zhang, Genentech/Roche (E, I); F. Debraine, Genentech/Roche (E); K. Ramesh, Genentech/Roche (E); V. Steffen, Genentech/Roche (E, I); S.S. Gao, Genentech/Roche (E, I), “Multimodal prediction of geographic atrophy growth rate” (P); C. Cukras, Genentech/Roche (E, I); C. Rabe, Genentech/Roche (E, I); D. Ferrara, Genentech/Roche (E, I); R.F. Spaide, Roche (C), Regeneron (C), Heidelberg (C), Topcon (C), Bayer (C), Topcon Medical Systems (R); S.R. Sadda, Allergan/AbbVie (C), Apelis (C), Alnylam (C), Amgen (C), Iveric Bio (C), Roche/Genentech (C), Novartis (C), Nanoscope (C), CharacterBio (C), Neurotech (C), NotalVision (C), Eyepoint (C), OcularTx (C), Alkeus (C), 4DMT (C), Oxurion (C), Optos (C), Heidelberg Engineering (C, R, F), iCare (C, F), Novartis (R), Optos (R, F), Carl Zeiss Meditec (R, F), Nidek (R, F), Regeneron (S), RegenxBio (S), Topcon (F); F.G. Holz, Genentech Inc. (F), Bayer (F), Heidelberg Engineering (F, C, R), Zeiss (F, C, R), Optos (F), Apellis (F, C, R), IvericBio/Astellas (F), Novartis (F, C, R), Pixium Vision (F), Bayer (C), Genentech Inc. (C), IvericBio/Astellas (C), Pixium Vision (C), Oxurion (C), Jansen (C); Q. Yang, Genentech/Roche (E, I), “Multimodal prediction of geographic atrophy growth rate” (P) 
Figure 1. Overview of the preprocessing steps of OCT images and CNN model architectures for the four approaches evaluated in the study.
Figure 2. Prediction performance of the four approaches on (a) the development dataset (cross-validation) and (b) the holdout dataset. Cross-validation performance is given as the mean ± SD (black circles and error bars), along with performance for individual cross-validation folds (colored circles), for the square of the Pearson correlation coefficient (r²). Holdout performance is given as r² (±95% CI, black circles and error bars). Thickness maps for the segmentation outputs were derived as the distance between iEZ and iRPE (EZ) and between iRPE and oRPE (RPE), where "i" refers to the inner side and "o" to the outer side of the corresponding layer.
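As context for the thickness-map derivation described in the caption, a minimal sketch of computing EZ and RPE en face thickness maps from segmented boundary surfaces might look as follows. All array shapes, boundary variable names, and the axial resolution here are illustrative assumptions, not details from the study.

```python
import numpy as np

# Illustrative segmented boundary surfaces on the en face grid: each array
# gives the axial pixel position of a boundary at every (B-scan, A-scan).
rng = np.random.default_rng(0)
n_bscans, n_ascans = 49, 512  # assumed scan pattern

iEZ = rng.uniform(100.0, 110.0, size=(n_bscans, n_ascans))
iRPE = iEZ + rng.uniform(5.0, 15.0, size=(n_bscans, n_ascans))  # below iEZ
oRPE = iRPE + rng.uniform(3.0, 8.0, size=(n_bscans, n_ascans))  # below iRPE

AXIAL_RES_UM = 3.9  # assumed micrometers per axial pixel

def thickness_map(upper, lower, axial_res_um=AXIAL_RES_UM):
    """En face thickness map: axial distance between two boundary surfaces."""
    return (lower - upper) * axial_res_um

ez_map = thickness_map(iEZ, iRPE)    # EZ thickness (iEZ to iRPE)
rpe_map = thickness_map(iRPE, oRPE)  # RPE thickness (iRPE to oRPE)
```

Each resulting 2D map has one value per A-scan location, which is the en face form consumed by the downstream models.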
Figure 3. Prediction performance with various combinations of OCT segmentation layer maps. The table on the left indicates which en face feature maps were included as input to the 2D CNN. Cross-validation performance is given as the mean ± SD (black circles and error bars), along with performance for individual cross-validation folds (colored circles), for the square of the Pearson correlation coefficient (r²). Thickness and intensity maps of the segmentation outputs were derived as the distance or mean intensity between layers, where "i" refers to the inner side and "o" to the outer side of the corresponding layer.
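The layer-map combinations indicated in the figure's table can be assembled as channels of a single input tensor for a 2D CNN. A minimal sketch, assuming hypothetical map names and grid size:

```python
import numpy as np

# Illustrative en face feature maps (thickness or mean intensity of a layer),
# each a 2D array on the same grid; names and sizes are assumptions.
h, w = 49, 512
rng = np.random.default_rng(0)
feature_maps = {
    "EZ_thickness": rng.random((h, w)),
    "RPE_thickness": rng.random((h, w)),
    "RPE_intensity": rng.random((h, w)),
}

# One combination, corresponding to a single row of the figure's table.
selected = ["EZ_thickness", "RPE_thickness"]

# Stack the chosen maps along a channel axis (channels-first convention),
# yielding one multi-channel image for the 2D CNN input layer.
x = np.stack([feature_maps[name] for name in selected], axis=0)
```

Swapping the `selected` list in or out of maps reproduces the different input combinations being compared, without changing the network itself.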
Table. Prediction Performance of the Four Approaches on the Development Dataset (Cross-Validation) and Holdout Dataset as Evaluated by Pearson's Correlation Coefficient Squared