Deep Learning and Machine Learning Algorithms for Retinal Image Analysis in Neurodegenerative Disease: Systematic Review of Datasets and Models

Purpose: Retinal images contain rich biomarker information for neurodegenerative disease. Recently, deep learning models have been used for automated neurodegenerative disease diagnosis and risk prediction using retinal images with good results.
Methods: In this review, we systematically report studies with datasets of retinal images from patients with neurodegenerative diseases, including Alzheimer's disease, Huntington's disease, Parkinson's disease, amyotrophic lateral sclerosis, and others. We also review and characterize the models in the current literature which have been used for classification, regression, or segmentation problems using retinal images in patients with neurodegenerative diseases.
Results: Our review found several existing datasets and models with various imaging modalities primarily in patients with Alzheimer's disease, with most datasets on the order of tens to a few hundred images. We found limited data available for the other neurodegenerative diseases. Although cross-sectional imaging data for Alzheimer's disease is becoming more abundant, datasets with longitudinal imaging of any disease are lacking.
Conclusions: The use of bilateral and multimodal imaging together with metadata seems to improve model performance; thus, multimodal bilateral image datasets with patient metadata are needed. We identified several deep learning tools that have been useful in this context, including feature extraction algorithms specifically for retinal images, retinal image preprocessing techniques, transfer learning, feature fusion, and attention mapping. Importantly, we also consider the limitations common to these models in real-world clinical applications.
Translational Relevance: This systematic review evaluates the deep learning models and retinal features relevant in the evaluation of retinal images of patients with neurodegenerative disease.


Introduction
The retina is the only neural tissue in the human body that can be directly visualized noninvasively. Findings from retinal imaging can be informative regarding the health of the brain; many abnormalities in retinal imaging have been linked with cerebral pathology. 1 In the past decade, image analysis has been revolutionized by convolutional neural networks and other promising emerging technologies for automated image analysis. With further development, these technologies may augment the diagnostic capabilities of several ophthalmologic imaging modalities. Machine learning algorithms may assist us in detecting information in retinal images that is not readily apparent without computational analysis. 2 Several groups have been using machine learning algorithms to determine whether systemic patient health information can be gleaned from retinal images. Such algorithms have demonstrated good accuracy at predicting quantitative variables, such as coronary artery calcium or serum creatinine, and also qualitative variables, such as smoking status or biological sex, using retinal fundus images alone. 3 It is difficult to identify which features in the retinal images are used by the machine learning algorithms to glean this information. However, there may be more information contained in retinal images than was previously known, and it may be necessary to apply computational algorithms, such as machine learning, to establish this. 4

The retina is known to exhibit many of the classic pathologic features of neurodegenerative disease. 6][7][8] In patients with Parkinson's disease (PD), pathology studies have found lower levels of dopamine in the retina. 9,10 The neurodegeneration that accompanies Huntington's disease (HD) and amyotrophic lateral sclerosis (ALS) may appear in the retina as well. 11,12 Case-control comparisons suggest that patients with mild cognitive impairment (MCI) or unspecified dementia (D-US) also exhibit retinal thinning. 13,14

Given the known correlation between retinal health and neurodegenerative disease, there is good potential that deep learning algorithms might be able to ascertain information regarding cerebral disease from retinal images. 15 Indeed, a growing body of literature has documented correlations between the progression of neurodegenerative disease and physician-observable retinal findings, such as retinal arteriolar and venular caliber, vessel tortuosity, retinal layer thickness, and optic disc morphology. 16 Future research will likely focus on determining what information is contained within optical coherence tomography (OCT), OCT angiography (OCT-A), and color fundus images. 15 Such studies will also need to consider what information cannot be obtained from retinal imaging.
OCT, OCT-A, and fundus imaging allow for detailed quantitative and qualitative analysis of retinal features. OCT uses the reflectivity of light to microimage the anatomy of the retina and optic disk. The peripapillary retinal nerve fiber layer (pRNFL) and macular ganglion cell layer and inner plexiform layer (mGCIPL) are especially implicated in neurodegenerative states, whereas other markers, such as macular volume and choroidal thickness, have also been studied. OCT-A works by comparing retinal layers across time as blood flows through the capillaries. OCT-A captures information regarding retinal vasculature, including microvascular density, branching complexity, and flow density. Fundus imaging allows for the direct visualization of the macula, optic disk, and retinal vasculature. Vessel tortuosity and branching complexity have been identified as helpful biomarkers, and other retinal features can be directly visualized through the use of fluorescence imaging. Each imaging modality provides a host of information that has revealed retinal manifestations of neurodegenerative disease.
Current research using retinal imaging appears to have only scratched the surface of the information that deep learning algorithms might provide, but there are also significant limitations that have yet to be addressed. Many non-neurologic diseases can have retinal manifestations indistinguishable from the features reportedly used by current machine learning models to distinguish between healthy eyes and ones with neurodegenerative disease. It will be important for future machine learning models to use more diverse datasets, based upon longitudinal data, to evaluate whether machine learning models can identify specific features that differentiate true neurodegenerative disease from other diseases with neuroretinal implications.

Methods
We conducted a systematic literature review utilizing two search tools to identify datasets: Google Scholar and Ovid MEDLINE. Our search criteria included studies published between January 1, 2012, and February 15, 2023, that contain image datasets with identifiable neurodegenerative disease and/or studies that utilized deep learning analysis models.
Ovid MEDLINE was used for refined searches, utilizing multi-level Boolean operators (and, or) and specific terminology (exp, explode; .mp, multipurpose) as described below. As the final compilation of search parameters, step 11 represents the final search protocol. The conceptual function of these parameters was to identify all articles with ophthalmic imaging in the context of neurodegenerative disease (steps 1-7), which also reported use of a dataset, database, or image analysis algorithm (steps 8-10). The conceptual search design is displayed in Figure 1. The Google search engine and Google Scholar were used for broad searches. The searched key terms were as follows: "retinal photography," "neurodegenerative disease OCT," "neurodegenerative disease fundoscopy," "Parkinson's retina," "neurodegenerative deep learning," and "neurodegenerative image dataset." Our Ovid search resulted in 245 studies and the Google searches resulted in 10 studies. Additionally, we referenced prior meta-analysis studies from Chrysou et al., 17 Zhou et al., 18 Jin et al., 19 Chan et al., 20 Noah et al., 13 Khan et al., 21 Nepal et al., 12 and Katsimpris et al. 22 Following our inclusion criteria, each result and dataset was independently reviewed and recorded. When an article provided incomplete information, datasets were assumed to be available upon request (AoR) and to contain two eyes per case and one image per eye.
For the review of deep learning models, articles identified in the primary search described above were further screened according to whether they involved predictive models using multivariate data obtained from imaging, predictive models using raw images as input, or predictive models using data extracted from image feature detectors. Articles outside of this scope were excluded from the review of models presented in Section 4: "Strategies used for retinal image analysis in patients with neurodegenerative disease."

Dataset Summary
In total, our search yielded 154 datasets containing approximately 70,481 images of 25,053 patients with neurodegenerative disease and 10,115 healthy controls. A summary of each dataset type can be found in Table 1. See Supplementary Table S1 for the comprehensive list of datasets. AD was the most represented disease (47% of datasets, 76% of patients, 66% of controls, and 74% of images), followed by PD (34% of datasets, 10% of patients, 26% of controls, and 14% of images) and MCI (20% of datasets, 4.1% of patients, 15% of controls, and 7.0% of images), whereas D-US was the least represented (2.6% of datasets, 7.2% of patients, 0.7% of controls, and 5.6% of images). No fundus image datasets were found for PD, ALS, or HD. Two datasets were accessible within the article, one was open-access, one required an account, one was unfinished, and the rest (n = 149) were classified as AoR. Most datasets utilized Heidelberg (54/154) and Zeiss (56/154) devices, whereas nine other manufacturers were also represented. The majority of datasets were from publications authored in Europe (66/154), whereas Asia (35/154), the Middle East (25/154), and North America (24/154) were also well represented, and a few datasets were generated in Oceania (2/154) and South America (3/154).

Retinal Findings Common Among One or More Neurodegenerative Diseases
OCT studies have revealed similarities and differences in the retinas of patients with various neurodegenerative diseases. Table 2 contains a summary of the retinal findings organized by disease. Due to variation in results between individual studies, meta-analyses were referenced when available. In patients with AD, PD, MCI, and D-US, thinning of the pRNFL has been seen in all four quadrants, whereas the temporal and superior quadrants have been the only affected quadrants in patients with HD. Reduced macular volume and mGCIPL loss have been demonstrated in patients with AD, PD, and MCI, but not ALS or HD. Macular thinning has been identified in patients with HD, AD, PD, and MCI, but not ALS or D-US. Although the inner nuclear layer appears to be spared in patients with PD, it is reduced in patients with ALS. Choroidal thinning has been found in patients with AD, HD, and MCI, but not PD. As discussed later, these and other retinal biomarkers have been shown to correlate with disease severity and duration.
OCT-A reveals additional information regarding the retinal vasculature of patients with neurodegenerative disease. Decreased microvascular density and an enlarged foveal avascular zone have been demonstrated in patients with AD and PD. Reduced branching complexity has also been associated with AD. Preliminary OCT-A case-control studies of patients with ALS, HD, or MCI have not found conclusive significant retinal findings. Fundus imaging case-control studies have been limited to only AD.

Alzheimer's Disease
OCT studies have revealed significant retinal neurodegeneration in patients with AD. A 2018 meta-analysis 20 found thinning in the pRNFL, mGCIPL, ganglion cell complex, and choroidal layers, as well as reduced overall macular volume and macular thinning in the inner and outer sectors. The mGCIPL thinning has also been shown to correlate with disease severity. 14

Various fluorescent fundus imaging modalities have been used to visualize and quantify retinal AD pathology. Intravenous administration of curcumin, a beta-amyloid-binding fluorophore, revealed that the retinal beta-amyloid burden is doubled in patients with AD. Retinal beta-amyloid levels were linked to cortical beta-amyloid burden and reduced hippocampal volume. 23,24 Alternatively, blue autofluorescence has been used to quantify the surface area of retinal inclusion bodies, which correlates with preclinical cortical beta-amyloid burden. 25 Finally, analysis of fluorescence lifetime imaging ophthalmoscopy revealed differences in phakic patients with AD compared to matched controls. 26

Changes in retinal vasculature have been identified in fundus images of patients with AD. Fractal dimension (FD), a quantitative representation of vascular branching complexity, can be determined by commercially available software or expert analysis. A systematic review in 2019 found that vascular FD was decreased in four case-control fundus imaging studies of AD. 27 In addition to reduced FD, one study found increased vessel tortuosity and narrowed venular caliber in patients with AD, although a separate study yielded contradictory findings. 28,29

Advancements in OCT-A have revealed further information regarding vascular changes in patients with AD. A 2021 meta-analysis demonstrated an enlarged foveal avascular zone and reduced macular whole en face superficial and deep vessel densities in patients with AD. 19 Notably, features of FD, vessel caliber, and vessel tortuosity that have been found on fundus imaging can also be evaluated using OCT-A, although systematic differences between modalities have been noted. 30

Table 1 footnotes:
§ In multiple HD and D-US case-control studies, the same controls were used for comparison against other disease types (AD, PD, and HD) as well. Therefore, these controls are included in each disease type but are not double counted in the total number of healthy controls.
|| Other: Custom system, Nidek, Optos, Opthalmika, SVision, Optopol, Opko.
# North America: United States of America and Canada; South America: Brazil, Argentina; Europe: United Kingdom, Italy, Germany, Spain, The Netherlands, Portugal, Belgium, Poland, Czech Republic, Greece, and Switzerland; Asia: China, India, South Korea, Singapore, Taiwan, and Hong Kong; Middle East: Iran, Turkey, and Israel; Oceania: Australia and New Zealand.

Table 2 abbreviations and footnotes:
AD, Alzheimer's disease; AF, autofluorescence; ALS, amyotrophic lateral sclerosis; D-US, dementia, unspecified; DVP, deep vascular plexus; FLIO, fluorescence lifetime imaging ophthalmoscopy; HD, Huntington's disease; mGCC, macular ganglion cell complex; mGCIPL, macular ganglion cell layer and inner plexiform layer; NS, not significant (no statistical difference between cases and controls); pRNFL, peripapillary retinal nerve fiber layer; PD, Parkinson's disease; MCI, mild cognitive impairment; SVP, superficial vascular plexus.
Databases containing fundus images of patients with MCI and D-US did not publish analysis of imaging features.
* Indicates a significance value provided by the referenced meta-analysis.
† Indicates that the P value of the corresponding number of studies fell within the indicated range.
‡ Inner and outer macular sectors classified according to Early Treatment Diabetic Retinopathy Study (ETDRS) guidelines.

Parkinson's Disease
In addition to reduced dopamine levels, the retina of patients with PD also exhibits neurodegeneration. Meta-analyses of OCT studies in 2019 17 and 2020 18 revealed reduced pRNFL, mGCIPL, and macular thickness, as well as decreased macular volume in patients with PD. In addition, disease severity and duration have been shown to be linked with pRNFL thinning and decreased foveal thickness. 31,32 Contradictory findings of both increased and decreased choroidal thickness have been reported, likely due to differences in image analysis. 33,34 Fundus imaging studies have revealed that patients with retinal thinning compared to age-matched controls have an increased risk of developing PD. 35

PD has also demonstrated an impact on retinal vasculature. A 2023 meta-analysis 22 found that patients with PD had reduced microvascular density in the whole superficial vascular plexus (SVP), foveal SVP, parafoveal SVP, and foveal avascular zone (FAZ), and reduced branching complexity has also been implicated. 36[39]

Amyotrophic Lateral Sclerosis
Studies on ALS have produced conflicting results regarding retinal neurodegeneration. A 2022 meta-analysis 12 found inner nuclear layer thickness to be the only statistically significant finding in patients with ALS. However, 6 of the 11 studies within the analysis also demonstrated significant pRNFL thinning. Future studies with differentiation between ALS subtypes (bulbar-onset versus spinal-onset) may help clarify the conflicting results. 12 In the only OCT-A study identified that assessed patients with ALS, retinal microvessel density was not significantly decreased compared to controls. 40

Huntington's Disease
OCT case-control studies of patients with HD have produced conflicting results. 2][43][44][45][46][47][48] One study found that temporal pRNFL thinning appeared in preclinical HD, whereas another could not replicate the finding. 44,48 Disease duration and severity may be correlated with temporal pRNFL thinning and reduced macular volume. 42 Two studies using OCT-A found no difference in vessel density. 41,46 A total of 115 patients with HD have been studied, suggesting that larger sample sizes may assist in ascertaining the retinal characteristics of HD.

Unspecified Dementia and Mild Cognitive Impairment
Patients with MCI have demonstrated retinal changes. A 2020 meta-analysis 13 discovered pRNFL thinning across 17 studies. OCT case-control comparisons have also demonstrated decreased mGCIPL, ganglion cell complex, macular, foveal, and choroidal thickness, and reduced macular volume, although not all studies shared the same findings. GCIPL thickness, 49 choroidal thickness, 50 pRNFL thickness, [51][52][53] and the pRNFL granular membrane area 54 have been shown to be inversely correlated with cognitive performance, whereas other studies show no correlation. 55 Decreased mGCIPL thickness has also been correlated with a reduction of white matter in the fornix of patients with MCI. 56 An OCT-A study discovered reduced microvascular density in the superior-nasal region of patients with MCI, but another study did not find any differences. 57,58

Few studies have investigated the relationship between the retina and unspecified dementia. Ferrari et al. 14 found pRNFL thinning and GCIPL loss, with GCIPL loss being connected with AD severity. On the other hand, Pillai et al. 59 found no statistically significant difference in RNFL thickness, GCIPL thickness, or macular volume between patients with unspecified dementia and healthy controls. Further studies will help explore the impact of unspecified dementia on the retina.

Strategies Used for Retinal Image Analysis in Patients With Neurodegenerative Disease
As discussed above, recent studies have elucidated various features found in retinal imaging that are associated with neurodegenerative diseases. On the other hand, several studies have used machine learning models to detect or otherwise learn more about these diseases using retinal images alone, without any a priori knowledge. Table 3 provides an overview of studies using deep learning algorithms and other predictive models for retinal image analysis in neurodegenerative disease, whereas Supplementary Table S2 contains additional details for each model. Deep learning models can detect features from unstructured data to make predictions using that data, with no guidance apart from data examples. The most prevalent deep learning algorithm for image analysis is known as the convolutional neural network (CNN). In retinal images of patients with neurodegenerative disease, some of the features learned by CNN models might be features already described in the scientific literature. Some may be observable features that have not yet been described, and others may be features that are even too subtle for a human observer to detect. CNNs have shown great promise for automating decision-making tasks using retinal images in patients with neurodegenerative disease.

Table 3 (fragment; remainder of table not recovered):
• AUC for D-US prediction on validation set: 0.86.
Color fundus images
• Histogram of oriented gradient was used for feature extraction.
• Single-center data lacking external validation.
• Cross-sectional rather than longitudinal data.
An excellent review recently detailed the computational strategies for using CNNs for retinal image analysis in general, which are applicable to the current discussion. The authors outlined the following stepwise elements in the overall framework for implementing a deep learning model for retinal image analysis. These include: (1) image acquisition and annotation, (2) retinal image preprocessing, (3) model architecture and design, (4) model training, (5) generating model predictions, and (6) evaluating model performance. 60 For a more detailed explanation of these stages, we refer the reader to this other review. 60 In the current work, we will discuss specific architectural elements as they pertain to the problem of retinal image analysis for neurodegenerative disease, and their past and future applications.

CNN Models: Basic Architectures and Applications
The architecture of a neural network refers to the arrangement of computational steps, which are known as layers. CNNs share a basic unit, the convolution layer, which is used to extract information from images. Computations in convolutional layers are fed forward in series to other types of computational layers, including pooling and fully connected (FC) layers, and also other downstream convolution layers. The most common types of computational tasks performed by CNNs are classification, regression, and segmentation. The architecture of VGGNet, a common CNN used for retinal image analysis, is shown in Figure 2.
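As a rough illustration of how convolution, pooling, and FC layers feed into one another, the following minimal pure-Python sketch applies a single hand-written 3 × 3 filter to a toy image. The filter values, the toy image, and all function names are assumptions made for illustration; real CNNs such as VGGNet learn many filters from data and are normally built with a deep learning framework.

```python
def conv2d(image, kernel):
    """Valid (no padding) 2D cross-correlation, the core operation of a convolution layer."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def relu(fmap):
    """Element-wise rectified linear activation."""
    return [[max(0.0, v) for v in row] for row in fmap]


def max_pool(fmap, size=2):
    """Non-overlapping max pooling; incomplete windows at the border are dropped."""
    return [
        [
            max(fmap[i + a][j + b] for a in range(size) for b in range(size))
            for j in range(0, len(fmap[0]) - size + 1, size)
        ]
        for i in range(0, len(fmap) - size + 1, size)
    ]


def fully_connected(fmap, weights, bias):
    """Flatten the feature map and compute one weighted sum (a single FC output unit)."""
    flat = [v for row in fmap for v in row]
    if len(flat) != len(weights):
        raise ValueError("weights must match the flattened feature map")
    return sum(w * x for w, x in zip(weights, flat)) + bias


# Toy 6 x 6 "image" with a dark left half and a bright right half.
image = [[0, 0, 0, 1, 1, 1] for _ in range(6)]
# Hypothetical hand-written filter that responds to dark-to-bright vertical edges.
kernel = [[-1, 0, 1], [-1, 0, 1], [-1, 0, 1]]

features = max_pool(relu(conv2d(image, kernel)))      # 6x6 -> 4x4 -> 2x2
score = fully_connected(features, [0.25] * 4, 0.0)    # single scalar output
```

The resolution drops from 6 × 6 to 2 × 2 as the data pass through the convolution and pooling steps, which mirrors the progressive downsampling described for classification networks.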

Classification
The structure of a basic CNN for the task of image classification usually involves several repeated units, each consisting of one or more convolution layers followed by a pooling layer. With each iteration of units, the resolution of the output decreases. Feedforward skip connections are used throughout to preserve important spatial data from earlier steps. Finally, an FC layer is used to transform the spatial image data into a structured classification output. Classification models have been the most common type of CNN among publications on retinal image analysis in the context of neurodegeneration. Accuracy and area under the receiver operating characteristic curve (AUC) are the most popular methods for measuring the performance of these models, but binary accuracy, sensitivity, and specificity are also used.
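The performance measures named above can be computed directly from model outputs. The sketch below is a minimal pure-Python illustration for a binary classifier; the function names, the 0.5 threshold, and the six example scores are assumptions for illustration and are not taken from any reviewed study. The AUC is computed as the probability that a randomly chosen case receives a higher score than a randomly chosen control, with ties counted as one half.

```python
def threshold_metrics(y_true, scores, threshold=0.5):
    """Accuracy, sensitivity, and specificity at a fixed decision threshold."""
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for t, p in zip(y_true, preds) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, preds) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, preds) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, preds) if t == 1 and p == 0)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative label")
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
    }


def roc_auc(y_true, scores):
    """Threshold-free AUC via pairwise comparison of positive and negative scores."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical labels (1 = disease, 0 = control) and model scores.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2]

metrics = threshold_metrics(y_true, scores)
auc = roc_auc(y_true, scores)
```

Unlike accuracy, sensitivity, and specificity, the AUC does not depend on a chosen threshold, which is one reason it is often reported alongside them.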
Disease detection is the most basic classification task; many different models have been constructed to take images of a fundus as input and provide as output a binary label distinguishing whether the image is from a patient with a neurodegenerative disease or from a healthy control. Attempts at detecting AD, PD, and general cognitive impairment using fundus photographs have been described previously, with varied results. CNN models trained on large fundus datasets have demonstrated improved performance compared to models trained on small datasets. For example, a model trained on nearly 13,000 images had an AUC of 0.91, 61 whereas a model trained on fewer than 300 images had an AUC of 0.83. 62 Ongoing work is being done to engineer these models to improve their diagnostic accuracy. Multi-class disease detection models have also been constructed. These aim to detect one of several potential disease states; for example, one study used support vector machines (a different type of machine learning model) to classify patients as having AD, PD, or neither. 63 Classification is not limited to the diagnosis of disease and may involve multiple classes representing any variable of interest. For example, one study devised a method for classifying the localization of age-related white matter changes using bilateral fundus photos fed to a CNN, followed by a regression and decision tree for brain region classification. 64 This approach was able to classify the location of white matter lesions into one of six potential regions (left and right frontal lobes, parietal-occipital lobes, or basal ganglia) using fundus photographs, with an AUC of 0.955 based on 10-fold cross-validation.
Risk stratification is another classification task, which involves labeling images as belonging to one of two groups of varying risk. Many such studies have attempted to predict the presence of a known risk marker. For example, one model was trained to detect the ApoE4 genotype using fundus photographs. However, this model was unsuccessful, with an AUC of 0.47, which may have been due to the low representation of ApoE4 individuals in the dataset, the model structure itself (although the model had good performance on age and sex predictions), or alternatively the inability of fundus photography as a modality to capture information about retinal amyloid deposits. 65

Regression
A very similar type of CNN model can perform a regression task using image inputs. Regression models have the same general framework as classification models, except for the final computational layer. In a regression model, the final layer outputs a numeric variable (rather than the categorical variables used in classification). Performance is most commonly summarized by the correlation coefficient R between predicted and observed values, or by the coefficient of determination R², which describes what proportion of the variability in the dependent variable can be explained by the model.
For instance, one model attempted to predict cognitive scores in a cohort of aging adults, although the model was reported to explain only 22% of the variability in cognitive scores (R = 0.22). 65 Another study with a similar objective of predicting the Cardiovascular Risk Factors, Aging, and Incidence of Dementia (CAIDE) score yielded a more useful model, with R = 0.76 on an external validation set and an AUC of 0.926 for the detection of high dementia risk defined as a score >10. 66 One of the more striking successful examples of regression using CNNs is the estimation of a person's age using a fundus photograph, a biomarker that is known as retinal age, with R = 0.81 in one study. 67,68 Other potential applications of regression are hazard and time-risk models. For instance, one model used the retinal age gap (the difference between retinal age and true age) and other demographic data as input into a Cox proportional hazards model to estimate the 5-year incidence of PD. 35
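When comparing reported regression results, it helps to keep the correlation coefficient R and the proportion of variance explained apart. For a simple linear relationship between predicted and observed values, the proportion of variance explained equals R squared, so R = 0.22 corresponds to roughly 5% of variance explained, whereas 22% of variance explained corresponds to R of roughly 0.47. The sketch below illustrates this arithmetic in pure Python; the function names and the five example values are assumptions for illustration, and the sketch makes no claim about which statistic any cited study actually reported.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


def variance_explained(r):
    """Proportion of variance explained by a simple linear relationship (R squared)."""
    return r * r


# Hypothetical observed and model-predicted values.
observed = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [2.0, 1.0, 4.0, 3.0, 5.0]

r = pearson_r(observed, predicted)          # 0.8
explained = variance_explained(r)           # about 0.64

# Converting between the two conventions:
explained_if_r_is_022 = variance_explained(0.22)   # about 0.048
r_if_22_percent_explained = math.sqrt(0.22)        # about 0.469
```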

Segmentation and Object Detection
Image segmentation is a different kind of task that uses a very different model architecture. The purpose of a segmentation algorithm is to use an input image to create a segmentation map, which is an image highlighting every pixel from the input image that is associated with a structure of interest. Similar to other CNN models, segmentation models use blocks of convolutional and pooling layers. In a segmentation model, intermediate convolution layers may scale down the resolution of the image data, but up-convolution layers are used to increase the output resolution (back to the original resolution in many cases). Feed-forward skip connections are used to preserve important spatial data from the higher-resolution steps to create the segmentation map.
Segmentation of the retinal vessels is one of the most important examples of this type of algorithm, as the vascular anatomy contains valuable information regarding neurodegenerative disease, as noted above. However, segmentation of retinal vessels has also been described as one of the most challenging tasks in retinal image processing. 69 Several segmentation algorithms have been developed over the past few years with the objective of generating images of the retinal vascular tree from fundus images, with all other details removed from the image. A vascular segmentation algorithm for OCT-A images has also been developed recently. 70 In practice, these models can be used to extract from the original image an image of white vessels only on a black background (or vice versa). Many studies have used pre-existing vessel segmentation algorithms to process input images and use the resulting vessel map alongside the original image as a second input to a classification or regression model. A standardized method has recently been described for evaluating and comparing the accuracy of vessel maps. Using this method, a per-pixel AUC (vessel or no vessel present in each pixel) can be calculated by averaging over every pixel in an entire image set. 70

Object detection shares some similarities with segmentation but also has some elements of a classification model. These algorithms can detect and quantify one or more features in a fundus photograph. For example, the automated retinal image analysis (ARIA) algorithm has been used to detect and count arteriole-venous nicking, arteriole occlusions, hemorrhages, and exudates, 64 and these results can be used as input features in a model.

Architectural Modifications and Other Tools for Increasing Model Performance
Presently, most studies using retinal images of patients with neurodegenerative disease have been relatively small in size (in the tens to hundreds of data examples) compared with datasets more commonly used for commercial deep learning applications (often in the thousands to over a million data examples). Ample large-scale datasets exist for diseases such as diabetic retinopathy or glaucoma, but there is a scarcity of available retinal image data for patients with neurodegenerative disease. There are some moderately sized datasets for AD and several small datasets that could potentially be pooled. However, sizable retinal image datasets from patients with other neurodegenerative diseases of lower prevalence, such as ALS, HD, or frontotemporal dementia, are greatly lacking.
The term "hand engineering" refers to the use of a priori knowledge to intentionally construct a model with specific elements to increase its likelihood of recognizing features that are already known to be important in the task to be automated. Hand engineering a model by incorporating various deep learning tools into the model architecture can increase the predictive power of CNN models, especially for small datasets, although these methods can also increase the accuracy of models trained on large datasets.
In this section, we discuss a few of the most important deep learning tools currently available to allow for the harnessing of a priori knowledge, or otherwise increase the power of models used to analyze retinal images from patients with neurodegenerative disease.

Image Preprocessing and Feature Extraction
One of the simplest means of hand engineering is preprocessing data to enhance specific features. Different combinations of image color, contrast, noise, or sharpness modifications have proven helpful for improving the visualization of darker vessels, the brighter optic nerve head, or retinal background lesions, for example. 60 In small datasets in which images from only one eye are input at a time, it can be helpful to horizontally flip all images of left eyes to match the right eyes to increase dataset homogeneity.
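A minimal sketch of two such preprocessing steps is shown below, assuming a grayscale image stored as a list of pixel rows. The function names, the "L"/"R" laterality labels, and the choice of min-max contrast stretching are assumptions for illustration; published pipelines typically use an imaging library and more sophisticated contrast enhancement.

```python
def flip_horizontal(image):
    """Mirror an image left-to-right."""
    return [list(reversed(row)) for row in image]


def standardize_laterality(image, eye):
    """Return the image oriented as a right eye.

    `eye` is a hypothetical laterality label: "L" for left, "R" for right.
    """
    if eye not in ("L", "R"):
        raise ValueError("eye must be 'L' or 'R'")
    return flip_horizontal(image) if eye == "L" else [list(row) for row in image]


def stretch_contrast(image):
    """Min-max contrast stretch: rescale pixel intensities to the range [0, 1]."""
    lo = min(min(row) for row in image)
    hi = max(max(row) for row in image)
    if hi == lo:
        # A constant image carries no contrast to stretch.
        return [[0.0 for _ in row] for row in image]
    return [[(v - lo) / (hi - lo) for v in row] for row in image]


# Toy 2 x 3 "image" from a left eye.
left_eye = [[1, 2, 3], [4, 5, 6]]
as_right = standardize_laterality(left_eye, "L")
stretched = stretch_contrast(as_right)
```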
Increasing the size of a dataset, known as data augmentation, can also improve model performance. For image datasets, this is commonly done using image transformations. The addition of randomly flipped, rotated, cropped, zoomed, or blurred copies of existing images in the dataset can improve model performance. Non-random cropping can also be useful, to focus on certain regions of interest, such as the optic nerve head or macula. 60

Pre-existing deep learning algorithms can also be used to transform an image during preprocessing. Several powerful retinal vessel segmentation algorithms have been developed recently. These tools have the ability to generate segmentation maps isolating the arteries, veins, or both. 69 Other segmentation algorithms can detect and create labeled maps of pathologic features including hemorrhages, microaneurysms, exudates, or retinal neovascularization. 71 These algorithms were specifically developed with the small-dataset problem in mind. Even a small-scale model can obtain key lesion information reliably if the inputted images are preprocessed by a pre-existing model that can detect these lesions accurately.
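The random image transformations used for data augmentation, described earlier in this section, can be sketched in a few lines. The example below is a minimal pure-Python illustration covering random flips, 90-degree rotations, and crops on a square toy image; zooming and blurring are omitted, and the function names, probabilities, and crop size are assumptions. Note that random horizontal flips would undo any left-to-right laterality standardization, so the two steps should not be combined blindly.

```python
import random


def flip_horizontal(image):
    """Mirror an image left-to-right."""
    return [list(reversed(row)) for row in image]


def rotate90(image):
    """Rotate an image 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def random_crop(image, height, width, rng):
    """Cut a height x width window at a random position."""
    if height > len(image) or width > len(image[0]):
        raise ValueError("crop size exceeds image size")
    top = rng.randrange(len(image) - height + 1)
    left = rng.randrange(len(image[0]) - width + 1)
    return [row[left:left + width] for row in image[top:top + height]]


def augment(image, rng, crop_size):
    """Return one randomly flipped, rotated, and cropped copy of `image`."""
    out = [list(row) for row in image]
    if rng.random() < 0.5:                 # flip half of the time
        out = flip_horizontal(out)
    for _ in range(rng.randrange(4)):      # rotate by 0, 90, 180, or 270 degrees
        out = rotate90(out)
    return random_crop(out, crop_size[0], crop_size[1], rng)


rng = random.Random(0)                      # fixed seed so the sketch is repeatable
image = [[4 * r + c for c in range(4)] for r in range(4)]   # toy 4 x 4 image
augmented = [augment(image, rng, (3, 3)) for _ in range(5)]
```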
Segmentation algorithms can also be used in the preprocessing of OCT images to identify the various retinal layers or create maps of their spatial thickness distribution. For example, segmentation of OCT layers can be used to generate several 2D maps of the thickness of each retinal layer, which can be used as input into a CNN. This is important feature information, given the known association between neurodegenerative disease and retinal layer thinning.
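To illustrate how a 2D thickness map can be derived from layer segmentation, the sketch below assumes that a segmentation algorithm has already returned, for each B-scan and A-scan position, the depth (in pixels) of the upper and lower boundary of one retinal layer. The input format and the `axial_um_per_pixel` scaling parameter are assumptions made for this example; real devices and segmentation tools differ in how they report boundaries and resolution.

```python
from typing import List

Surface = List[List[float]]  # boundary depth in pixels, indexed [b_scan][a_scan]


def thickness_map(upper: Surface, lower: Surface, axial_um_per_pixel: float) -> Surface:
    """Convert two segmented layer boundaries into a thickness map in micrometers."""
    if len(upper) != len(lower) or any(len(u) != len(l) for u, l in zip(upper, lower)):
        raise ValueError("upper and lower boundaries must have the same shape")
    return [
        [(lo - up) * axial_um_per_pixel for up, lo in zip(upper_row, lower_row)]
        for upper_row, lower_row in zip(upper, lower)
    ]


# Hypothetical boundaries for a 2 x 2 grid of scan positions.
upper = [[10, 11], [12, 12]]
lower = [[20, 22], [25, 24]]
layer_thickness = thickness_map(upper, lower, axial_um_per_pixel=2.0)
print(layer_thickness)  # [[20.0, 22.0], [26.0, 24.0]]
```

One such map per layer could then be stacked as separate input channels to a CNN.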
Feature extraction is the practice of using computations on original input data, which may include pre-existing neural network models, but also simpler calculations, to extract desired data features. This can be done with pretrained CNNs specialized for extracting a specific type of feature of interest (e.g., vascular information). For example, the retina-based microvascular health assessment system (RMHAS) is an algorithm that can extract the following features from a retinal image: average vessel diameter, average vessel length, fractal dimension, branching angles, tortuosity, branching coefficient, asymmetry ratio, junctional exponent deviation, and angular asymmetry.72 Several other models are available for extracting various quantitative features from retinal images. The extracted feature data can then be used as input to a model. Occasionally, when features of interest are unknown, feature extraction can be done using the pretrained weights from early layers of a general image classifier (such as one trained on ImageNet), although the outputs from this form of feature extraction are difficult to correlate with any physiological characteristics.
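As a simple example of a quantitative vascular feature, the sketch below computes vessel tortuosity as the ratio of arc length to chord length along a vessel centerline. This is one commonly used definition and is assumed here for illustration only; it is not necessarily the definition implemented by RMHAS or any other cited tool, and the centerline coordinates are hypothetical.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def arc_length(centerline: List[Point]) -> float:
    """Total path length along consecutive centerline points."""
    return sum(math.dist(p, q) for p, q in zip(centerline, centerline[1:]))


def tortuosity(centerline: List[Point]) -> float:
    """Arc-to-chord ratio: 1.0 for a straight vessel, larger for a winding one."""
    chord = math.dist(centerline[0], centerline[-1])
    if chord == 0:
        raise ValueError("centerline endpoints coincide; chord length is zero")
    return arc_length(centerline) / chord


straight_vessel = [(0, 0), (1, 0), (2, 0)]
curved_vessel = [(0, 0), (3, 4), (6, 0)]
print(tortuosity(straight_vessel))  # 1.0
print(tortuosity(curved_vessel) > tortuosity(straight_vessel))  # True
```

A vector of such values (one per vessel, or summary statistics per image) is an example of the structured feature data that can be passed to a downstream model.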
One potential pitfall of both preprocessing and feature extraction is the loss of data due to the transformation of the original image data. For example, cropping to focus on the optic disk would result in a loss of data for the entire rest of the fundus. Likewise, inputting only segmented images of the vascular tree would neglect the rest of the retinal background. By the same token, inputting only structured data obtained from feature extraction would neglect any other features potentially contained in the image. For larger datasets, it is usually best to use the original images as parallel inputs together with the preprocessed images to avoid the loss of features within the original image, although for smaller datasets this loss may not negatively impact model performance. The methods for merging parallel inputs, known as feature fusion, will be discussed in another section below.

Transfer Learning
In the previous section, we discussed how the use of pre-existing models can be a powerful tool when hand engineering a model. This is also true for training the model itself. The task of basic recognition of curves, outlines, and shapes is not trivial and must be learned by a naive model. Rather than training a naive model, the more common practice is to begin with a pretrained network, load its learned weights, and then fine-tune the weights of a few final layers during the specialized training for the specific task at hand. Importantly, the use of transfer learning was shown to yield better results compared with naive model training for predicting systemic information using retinal images, even for naive models trained on very large datasets.73 Examples of common pretrained CNN models include ResNet (Microsoft), VGGNet (Oxford), EfficientNet (Google), and GoogLeNet/Inception (Google), which are typically pretrained on the ImageNet dataset.
Using pretrained networks can be problematic, however. Pretrained CNNs tend to favor features that were important for classifying images from their original training datasets. For example, many pretrained CNNs are known to favor texture features over shape features.74 Although texture, or local patterns distributed over a small area, seems to be very important for neurodegenerative disease classification,75 it may be a pitfall to neglect key shape information, which is associated with objects across a larger area in the image. One solution to this problem is the use of pretrained networks that prioritize shape, such as those trained on Stylized-ImageNet, a dataset constructed to intentionally emphasize shape features over texture.
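The core idea of transfer learning described in this section, keeping pretrained weights fixed and training only a new final layer, can be illustrated with a deliberately tiny, framework-free sketch. Everything here is a toy assumption: the "backbone" is a fixed linear map with ReLU and made-up weights standing in for a pretrained CNN, and the new head is a single logistic unit trained by stochastic gradient descent. In practice, a deep learning framework would be used and the backbone parameters would be marked as non-trainable.

```python
import math
from typing import List, Sequence, Tuple

# Stand-in for weights learned on a large generic dataset (hypothetical values).
# Stored as tuples so the "frozen" backbone cannot be modified during training.
PRETRAINED_BACKBONE = ((0.5, -0.2, 0.1), (-0.3, 0.8, 0.4))


def backbone_features(x: Sequence[float]) -> List[float]:
    """Frozen feature extractor: fixed linear map followed by ReLU."""
    return [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in PRETRAINED_BACKBONE]


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def predict(x: Sequence[float], head: Sequence[float], bias: float) -> float:
    """Probability of the positive class from frozen features plus the trained head."""
    feats = backbone_features(x)
    return sigmoid(sum(w * f for w, f in zip(head, feats)) + bias)


def fine_tune_head(samples, labels, epochs: int = 200, lr: float = 0.5) -> Tuple[List[float], float]:
    """Fit only the new classification head; the backbone is never updated."""
    head = [0.0] * len(PRETRAINED_BACKBONE)
    bias = 0.0
    features = [backbone_features(x) for x in samples]
    for _ in range(epochs):
        for f, y in zip(features, labels):
            p = sigmoid(sum(w * fi for w, fi in zip(head, f)) + bias)
            error = p - y  # gradient of the log loss with respect to the logit
            head = [w - lr * error * fi for w, fi in zip(head, f)]
            bias -= lr * error
    return head, bias


samples = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]  # toy inputs, not real image data
labels = [0, 1]
head, bias = fine_tune_head(samples, labels)
```

Because only the head is trained, far fewer parameters must be learned from the task-specific data, which is one reason this strategy suits small datasets.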

Feature Fusion
Feature fusion is the merging of features from different branches or different layers in a model so that the model will consider them together in one or more computational steps. Commonly, this is accomplished by simple concatenation (appending) or summation (adding). Feature fusion can be used to combine parallel branches (e.g., to fuse patient metadata with a fundus image) or to combine elements within a branch (e.g., between convolution layers downstream of a single input image).
One important application of feature fusion to the current discussion is the simultaneous consideration of bilateral fundus images. It seems particularly important to consider data from both eyes when using fundus images to predict extraocular disease states, such as neurodegenerative disease. Several recent studies have described various strategies of feature fusion for the analysis of bilateral fundus images. Recently, a powerful model using feature fusion of bilateral fundus images achieved >99% precision and >99% recall classifying 8 different disease classes, including systemic diseases like diabetes and hypertension, using retinal images from the ODIR dataset (n = 5000).76
Another possibility with feature fusion is the parallel use of metadata, which may include demographic information such as sex and age, ApoE genotype, cardiovascular laboratory markers, or potentially any information from the medical history. These data seem especially important for giving systemic contextual information and have proven useful in several models. For example, the use of metadata fused with imaging by simple concatenation improved the accuracy of skin cancer classification using dermoscopy images (see Figure 3).77 Similarly, the detection of anemia using fundus imaging together with metadata was significantly more accurate compared to either alone.78
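The two simple fusion operations named in this section, concatenation and summation, can be sketched on plain feature vectors as follows. The feature values, the two-field metadata encoding (age scaled by 100 and a binary sex indicator), and all function names are hypothetical choices made for this example rather than the design of any cited model.

```python
from typing import List, Sequence


def fuse_concat(*vectors: Sequence[float]) -> List[float]:
    """Concatenation: append feature vectors end to end (lengths may differ)."""
    fused: List[float] = []
    for v in vectors:
        fused.extend(v)
    return fused


def fuse_sum(a: Sequence[float], b: Sequence[float]) -> List[float]:
    """Summation: add feature vectors element by element (lengths must match)."""
    if len(a) != len(b):
        raise ValueError("summation requires vectors of equal length")
    return [x + y for x, y in zip(a, b)]


def encode_metadata(age_years: float, sex: str) -> List[float]:
    """Hypothetical numeric encoding of two metadata fields."""
    return [age_years / 100.0, 1.0 if sex == "F" else 0.0]


# Hypothetical feature vectors produced by an image branch for each eye.
right_eye = [0.2, 0.7, 0.1]
left_eye = [0.4, 0.5, 0.3]

bilateral = fuse_concat(right_eye, left_eye)                     # 6 values
with_metadata = fuse_concat(bilateral, encode_metadata(72, "F"))  # 8 values
summed = fuse_sum(right_eye, left_eye)                           # 3 values
```

Concatenation preserves which branch each value came from at the cost of a longer vector, whereas summation keeps the vector length fixed but requires both branches to produce features of the same size.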

Attention Mechanism for CNNs
The attention mechanism is a deep learning tool that was first developed in language models, giving them the ability to maintain attention to key parts of a sentence that are closely linked either grammatically or semantically despite being distant from one another (many words apart within a sentence or paragraph). For CNNs, the mechanism of spatial attention is the ability to give more weight to certain regions of the input image and less to others, even when those regions are distant from one another. The attention function is learnable by the model during training rather than hardcoded.
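A minimal sketch of spatial attention, under simplifying assumptions, is shown below: each location of a single-channel feature map receives a score, the scores are converted to weights that sum to 1 with a softmax, and the features are pooled using those weights. In a real model the scores would be produced by a learnable layer; here they are supplied directly as hypothetical values.

```python
import math
from typing import List

Grid = List[List[float]]


def softmax_2d(scores: Grid) -> Grid:
    """Turn a grid of scores into positive weights that sum to 1."""
    peak = max(s for row in scores for s in row)  # subtract the max for numerical stability
    exps = [[math.exp(s - peak) for s in row] for row in scores]
    total = sum(sum(row) for row in exps)
    return [[e / total for e in row] for row in exps]


def attention_pool(features: Grid, scores: Grid) -> float:
    """Weighted average of the feature map, emphasizing high-scoring locations."""
    weights = softmax_2d(scores)
    return sum(
        w * f
        for w_row, f_row in zip(weights, features)
        for w, f in zip(w_row, f_row)
    )


features = [[1.0, 2.0], [3.0, 4.0]]
uniform = attention_pool(features, [[0.0, 0.0], [0.0, 0.0]])   # equal weights: plain average
focused = attention_pool(features, [[0.0, 0.0], [0.0, 50.0]])  # weight concentrated bottom-right
```

With uniform scores the result reduces to ordinary average pooling, whereas a high score at one location makes the pooled value approach the feature at that location.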
A new method for incorporating attention into feature fusion has been described recently.79 Attentional feature fusion allows a model to weight different features preferentially rather than equally, as happens with simple methods like concatenation and summation.
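The contrast between equal and preferential weighting can be illustrated with a simplified gated fusion. In this sketch, a gate between 0 and 1 is computed from the two inputs and used to blend them, so the contribution of each branch depends on the data. The elementwise sigmoid gate with parameters `gate_weights` and `gate_biases` is a stand-in assumed for illustration; the published attentional feature fusion method uses a learned attention module whose details are not reproduced here.

```python
import math
from typing import List, Sequence


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def attentional_fusion(
    x: Sequence[float],
    y: Sequence[float],
    gate_weights: Sequence[float],
    gate_biases: Sequence[float],
) -> List[float]:
    """Blend two feature vectors with an input-dependent gate in (0, 1)."""
    fused = []
    for xi, yi, w, b in zip(x, y, gate_weights, gate_biases):
        gate = sigmoid(w * (xi + yi) + b)  # stand-in for a learned attention module
        fused.append(gate * xi + (1.0 - gate) * yi)
    return fused


x = [2.0, 1.0]
y = [4.0, 3.0]
# With zero gate parameters the gate is 0.5 everywhere: both inputs count equally.
equal_blend = attentional_fusion(x, y, [0.0, 0.0], [0.0, 0.0])
# A large positive bias drives the gate toward 1, so the output follows x.
favor_x = attentional_fusion(x, y, [0.0, 0.0], [50.0, 50.0])
```

During training, the gate parameters would be learned together with the rest of the model, allowing it to decide which branch to emphasize for each feature.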

Class Activation Mapping: Shedding Light on Influential Features
A major drawback to the use of deep neural networks for image analysis has been their black-box nature, or, in other words, their inability to explain their decision-making processes. Class activation mapping (CAM) is a method of highlighting the features, either in the original image or in any of the convolutional layers, that are most influential in a model's decision; it can be used for classification and segmentation tasks. This is a method of inspecting a convolution layer's implicit attention by displaying a map of the relative weights for each pixel. Self-attention class activation maps (SA-CAM)80 have recently been described as a new method by which spatial attention maps can be visualized, which is useful for displaying explicit rather than implicit model attention.
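As a hedged illustration of how such a map can be computed, the sketch below follows the basic CAM formulation for a network whose last convolutional layer is followed by global average pooling and a linear classifier: the map for a class is the sum of the final feature maps, each weighted by that class's classifier weight. The activations and weights shown are hypothetical values, and variants of CAM used in the cited studies may differ.

```python
from typing import List, Sequence

Grid = List[List[float]]


def class_activation_map(activations: Sequence[Grid], class_weights: Sequence[float]) -> Grid:
    """Weighted sum of feature maps: cam[i][j] = sum_k w_k * A_k[i][j]."""
    height = len(activations[0])
    width = len(activations[0][0])
    return [
        [sum(w * a[i][j] for w, a in zip(class_weights, activations)) for j in range(width)]
        for i in range(height)
    ]


def normalize(cam: Grid) -> Grid:
    """Rescale a map to the range [0, 1] for display as a heat map."""
    flat = [v for row in cam for v in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        return [[0.0 for _ in row] for row in cam]
    return [[(v - lo) / (hi - lo) for v in row] for row in cam]


# Two hypothetical 2 x 2 feature maps from the last convolutional layer.
activations = [
    [[1.0, 0.0], [0.0, 0.0]],
    [[0.0, 0.0], [0.0, 2.0]],
]
class_weights = [1.0, 2.0]  # hypothetical classifier weights for one class
heat_map = normalize(class_activation_map(activations, class_weights))
print(heat_map)  # [[0.25, 0.0], [0.0, 1.0]]
```

The normalized map is usually upsampled to the input resolution and overlaid on the original image so that influential regions can be inspected visually.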
The benefit of using CAM is that it may help find meaningful clinical correlations of model behavior. In diabetic retinopathy models, attention maps have shown model attention to lesions known to be important for Early Treatment Diabetic Retinopathy Study (ETDRS) classification of diabetic retinopathy, such as microaneurysms and hemorrhages (Figure 4).81 There is currently no known classification system for neurodegenerative disease based on retinal image findings. However, CAM in a model for the detection of symptomatic AD showed that the model gave attention to areas of decreased vessel density in the fovea and temporal macula in images classified as positive for AD.62 Making sense of the features used by CNNs in their decision making will be important for linking model predictions with physiologic features.

Discussion
In the recent literature, a variety of studies have demonstrated that retinal image analysis using deep learning can provide highly useful predictions regarding neurodegenerative disease diagnosis and risk assessment with good accuracy. Using retinal fundus photographs and/or OCT images, convolutional neural network algorithms have performed impressive tasks such as automated diagnosis of AD and PD, calculation of risk of future PD or dementia, and localization of cerebral white matter disease.
Recent advances in deep learning should be considered in the development of future models. For retinal image analysis in general, transfer learning, or the use of a convolutional neural network with pretrained weights, seems to be superior to training a model with naive weights, even when a large fundus image dataset is available for training. New advances, such as attentional feature fusion, may also improve researchers' ability to understand model decision making. Ongoing development of feature extraction algorithms is enabling the generation of improved maps of lesions and vasculature, as well as the generation of structured feature information such as retinal age or whole-fundus vessel caliber. However, the use of extracted feature information together with original images yields better results than extracted features alone.
Because the performance and generalizability of convolutional neural network models tend to improve as more images are used during training, the predictive power of future models is expected to increase as more retinal image data becomes available in patients with neurodegenerative disease. Several existing datasets are available and contain various imaging modalities primarily in patients with AD, although dataset size for most of these is on the order of tens to a few hundred images. There is very little image data available for PD, HD, ALS, or other neurodegenerative diseases. At present, although cross-sectional imaging data is becoming more abundant, datasets with longitudinal imaging are altogether lacking. Furthermore, the use of bilateral and multimodal imaging together with metadata seems to improve model performance; thus, larger multimodal datasets with patient metadata and bilateral images are also needed.
Thus far, a broad range of device manufacturers have been represented in datasets. It could certainly be a pitfall to deploy a machine learning model to interpret images acquired using devices different from those used for model training. This could lead to unpredictable model behavior. However, using images from different devices and manufacturers together in the same training set would not be expected to detract from the rigor of a model as long as the images in the test dataset were acquired using one of the devices used for training. In fact, a model trained using images from several different devices or manufacturing systems could have better generalizability than a model trained on images from a single device.
Although deep learning can potentially be a powerful tool for neurodegenerative disease, virtually all models in the current literature share a similar limitation. Features such as retinal thinning are highly nonspecific and could represent a variety of pathologies, such as glaucoma, diabetes, or other inflammatory retinopathies. In addition, whereas many reports have claimed to detect neurodegenerative diseases with high specificity and sensitivity, most of these datasets are poorly representative of a realistic clinical population. It will be important for future models to use more diverse datasets that do not exclude disease with ocular manifestations and re-evaluate whether these models can identify features differentiating true neurodegenerative disease from other diseases with retinal implications. Given these limitations, it remains uncertain whether automated retinal image analysis using machine learning algorithms will be useful for the diagnosis of neurodegenerative disease in clinical practice.

Figure 2. Architecture of VGGNet. Used with permission from Goutam et al. 2022.60 CONV, convolutional layer; FC, fully connected layer; ReLU, rectified linear unit.

Figure 3. Simple approach for combining fundoscopic image processing and metadata processing. Adapted with permission from Gessert et al. 2020.77 CONV, convolutional layer; FC, fully connected layer; ReLU, rectified linear unit.

Figure 4. Example of attention maps used in the classification of diabetic retinopathy. Pixels in the image that are of higher relevance to model decision making are highlighted in yellow and red. Adapted with permission from Zhang et al. 2022.81

Table 1. Summary of Retinal Image Datasets in Patients With Neurodegenerative Disease
If not specified within the article, datasets were assumed to be available upon request (AoR) from the corresponding author.
* † Extrapolated two eyes per case if not specified. ‡ Extrapolated one image per eye if not specified.

Table 2. Summary of Retinal Findings in Neurodegenerative Disease

Table 3. Deep Learning Algorithms and Other Predictive Models for Retinal Image Analysis in Neurodegenerative Disease