**Purpose**:
To identify the most suitable model for assessing the rate of growth of total geographic atrophy (GA) by analysis of model structure uncertainty.

**Methods**:
Model structure uncertainty refers to unexplained variability arising from the choice of mathematical model and represents an example of epistemic uncertainty. In this study, we quantified this uncertainty to help identify a model most representative of GA progression. Fundus autofluorescence (FAF) images and GA progression data (i.e., total GA area estimation at each presentation) were acquired using Spectralis HRA+OCT instrumentation and RegionFinder software. Six regression models were evaluated. Models were compared using several statistical measures, namely the coefficient of determination (*r*^{2}), the uncertainty metric (*U*), and the test of significance for the correlation coefficient (*r*), as well as adherence to expected physical and clinical assumptions of GA growth.

**Results**:
Analysis was carried out for 81 GA-affected eyes, 531 FAF images (range: 3–17 images per eye), over a median of 57 months (IQR: 42, 74), with a mean baseline lesion size of 2.62 ± 4.49 mm^{2} (range: 0.11–20.69 mm^{2}). The linear model proved to be the most representative of total GA growth, with the lowest average uncertainty (original scale: *U* = 0.025, square root scale: *U* = 0.014) and a high average *r*^{2} (original scale: 0.92, square root scale: 0.93). The applicability of the model was supported by a high correlation coefficient, *r*, with statistical significance (*P* = 0.01).

**Conclusions**:
Statistical analysis of uncertainty suggests that the linear model provides an effective and practical representation of the rate and progression of total GA growth based on data from patient presentations in clinical settings.

**Translational Relevance**:
Identification of the correct model structure to characterize the rate of growth of total GA in the retina using FAF images provides an objective metric for comparing interventions and charting GA progression in clinical presentations.

^{1} The etiology of GA remains elusive, and no drug therapies are available.^{2} GA is a progressive disease, and vision will continue to deteriorate with a possibility of legal blindness. The deterioration in vision is associated with the growth of GA lesions. GA is defined as dead retinal pigment epithelium (RPE) and photoreceptor cells with closure of the underlying choriocapillaris.^{2,3} They appear as sharply demarcated areas, which are traditionally identified by retinal imaging.^{2} The appearance of GA lesions in the macular region affects vision, and the severity of vision loss is linked with the size and the location of the lesions in the macula.^{4,5} The pattern of GA growth is not well understood, and research publications to date have often described the trend in GA growth in qualitative terms rather than quantitatively. It would be useful to have an objective and quantitative metric for GA progression as a means of trend identification and to predict the rate of growth of GA. A predictive model would inform the patient and the clinician about disease progress and inform on the validity of any future interventions.^{1,6}

Past “qualitative” observations have suggested a possible linear progression of GA lesion area.^{1,5,7} Dreyhaupt et al.^{7} modeled the natural course of GA using linear and exponential mixed-effects models fitted to areas of atrophy computed from fundus autofluorescence (FAF) images.^{7} They found that the linear and exponential models were similar in performance. The Age-Related Eye Disease Study (AREDS) Report Number 26 suggested that, at least for fundus photographs, a linear model growth was superior to a quadratic model for different lesion sizes. The AREDS study report suggested that a linear relationship may in part be related to overlapping areas of atrophy (rather than the expansion of a single lesion over time).^{8} Despite anecdotal observations for individual lesions, there appears to be very little reported on modeling total GA progression (i.e., global growth of all lesions combined). The variability in many study findings has been attributed to errors associated with the accuracy and precision of assessment methods and therefore epistemic uncertainty also needs to be addressed in model development.^{6}

^{9–11} In the case of GA assessment, the most relevant epistemic uncertainty is “model structure uncertainty” (i.e., identifying the correct model for disease progression). This uncertainty can be due to limited availability of comprehensive datasets, incomplete knowledge of the disease etiology and pathogenesis, and errors from measuring equipment that can lead to the presence of imprecise and uncertain data (i.e., measurement error and noisy data). This makes the process of developing an appropriate model challenging. For example, although the RegionFinder software provides a fast, consistent, and semiautomated process for area segmentation, it relies on human-user input for its function. A grader relies on judgement and experience when annotating GA lesions (Fig. 1). Methods available to assess structural problems include model checking (e.g., goodness-of-fit tests, calibration test, residual error assessment) and comparing tested predictions against independent data.^{12,13} There appear to be no publications that have quantified epistemic uncertainties in GA measurement. Also, the authors are unaware of any publications that have incorporated uncertainty analysis when modeling the progression of GA.

^{14} First described by Delori et al.,^{15} the FAF (excitation wavelength 488 nm, emission >500 nm) is an ophthalmic imaging technique designed to capture GA as hypoautofluorescent (black) areas with sharp borders that delineate GA lesions, and hyperautofluorescent areas (bright areas that show the main fluorophore lipofuscin, but other fluorophores exist as well) that provide insight into the health and functionality of the RPE.^{15–18} The FAF images were acquired on the Spectralis HRA+OCT instrument. Pupils were dilated at every visit before image acquisition. FAF images with 30° × 30° field of view were captured. The dataset consisted of images in both high speed (768 × 768 pixels) and high resolution (1536 × 1536 pixels) formats. The Automatic Real-Time Tracking (ART) was also recorded for each image.

^{19–22} Published evaluations confirm that the RegionFinder algorithm is accurate, reproducible, and time-efficient for identification and quantification of lesions.^{19,21,22} It is currently incorporated into the Spectralis instrument and used in clinical practice.

^{23} investigated the reproducibility of GA area measurements and enlargement rate of GA, including usefulness of the square root transformation. They found that it eliminated GA baseline size dependency from the GA growth rate. The correlation between lesion size and test–retest standard deviations was significant with respect to original GA area (Pearson's *r* = 0.60, *P* < 0.001; Spearman's ρ = 0.73, *P* < 0.001). However, when a square root transformation of the lesion area measurements was performed prior to test–retest standard deviation calculations, the correlation between baseline lesion size and test–retest standard deviations was no longer apparent (Pearson's *r* = 0.07, *P* = 0.72; Spearman's ρ = 0.12, *P* = 0.51).
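The effect of the square root transformation on baseline size dependency can be illustrated with a minimal sketch. It assumes an idealized, hypothetical cohort of circular lesions whose radius grows by the same amount each year; the radii, the growth value, and the helper `pearson_r` are illustrative assumptions and are not taken from the study data or from any cited work.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical cohort (not study data): circular lesions whose radius
# grows by the same 0.15 mm/year regardless of lesion size.
baseline_radius_mm = [0.3, 0.6, 0.9, 1.2, 1.5, 1.8, 2.1, 2.4]
radial_growth_mm = 0.15

baseline_area = [math.pi * r ** 2 for r in baseline_radius_mm]
area_after_1y = [math.pi * (r + radial_growth_mm) ** 2 for r in baseline_radius_mm]

# Growth on the original scale (mm^2/year) and on the square root scale (mm/year).
area_growth = [a1 - a0 for a0, a1 in zip(baseline_area, area_after_1y)]
sqrt_growth = [math.sqrt(a1) - math.sqrt(a0)
               for a0, a1 in zip(baseline_area, area_after_1y)]

# On the original scale, growth rises with baseline lesion size.
print("r(baseline area, area growth):", round(pearson_r(baseline_area, area_growth), 3))
# On the square root scale, growth is the same for every lesion.
print("spread of sqrt-scale growth:", max(sqrt_growth) - min(sqrt_growth))
```

In this idealized case, growth on the original scale is strongly correlated with baseline area, whereas growth on the square root scale is constant across lesions. Real lesions are neither circular nor noise-free, so the sketch shows only the direction of the effect.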

^{24} quantified lesion progression using a linear mixed-effects model with two-level random effects (i.e., eye- and patient-specific effects) and shape-descriptive factors. The authors normalized the variables for the lesion area, perimeter, and circularity using the square root transformation.^{24} Monés and Biarnés^{25} assessed the progression of GA and its baseline using the square root transformation, with both Pearson's *r* and Spearman's ρ in the assessment. They plotted the relationships using linear regression with locally weighted scatterplot smoothing curves; one plot compared GA area growth (mm^{2}/year) against baseline GA area (mm^{2}), whereas the other plotted radial growth (mm/year) against the square root–transformed baseline. They found the correlation between radial growth and square root–transformed baseline area was negative (Pearson's *r* = –0.30, *P* = 0.0005; Spearman's ρ = –0.25, *P* = 0.0042), which suggests that as lesions grow larger, the progression rate starts decreasing.^{25} Domalpally et al.^{26} studied a parameter, the Geographic Atrophy Circularity Index (GACI), in the assessment of GA progression. They used regression analysis to assess the relationship between baseline characteristics and annual progression rates of GA. Similar to Monés and Biarnés,^{25} they found statistically significant correlations between GACI and growth rate in mm^{2} (*r* = –0.31, *P* < 0.001) and GACI and square root–transformed measurements (*r* = –0.39, *P* < 0.001).
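The link between square root–transformed area (mm/year) and radial growth used in the studies above can be made explicit for an idealized case. The derivation below assumes a single circular lesion of radius *R*(*t*) growing at a constant radial rate *k*; both symbols are introduced here for illustration only and do not appear in the cited works.

```latex
\begin{align}
  A(t) &= \pi R(t)^{2}
  &&\Rightarrow& \sqrt{A(t)} &= \sqrt{\pi}\, R(t), \\
  \frac{dR}{dt} &= k
  &&\Rightarrow& \frac{d\sqrt{A}}{dt} &= \sqrt{\pi}\, k \quad \text{(constant, in mm/year)}, \\
  & && & \frac{dA}{dt} &= 2\pi R\, k = 2\sqrt{\pi}\, k \sqrt{A} \quad \text{(increases with lesion size, in mm}^{2}\text{/year)}.
\end{align}
```

Under this assumption, the growth rate on the original scale depends on the current lesion size, whereas the growth rate on the square root scale does not. Multifocal or irregular lesions would depart from this idealization.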

^{2}/year); and (2) the square root–transformed GA area (with a unit of mm/year). We used the following criteria to determine the strength of the square root transformation for the cohort: (1) whether the square root transformation normalized the distribution of residuals from the regression models tested; (2) whether the transformation linearized the growth rates, as would be expected from this type of transformation; and (3) whether the transformation significantly improved the fit of the model as compared with its untransformed, original-scaled counterpart. It is important to note that, in regression analysis, the assumption of normality applies to the residuals only. The independent and dependent variables themselves may be skewed, provided that the residuals of the regression model are normally distributed.

*r*^{2}, for each model for each eye (where *r* is the correlation coefficient), and then placing the model *r*^{2} values into a matrix (model type × patient ID) for comparison; (3) quantifying the uncertainty metric, *U*, for each model; (4) when models had fractional differences in terms of trends, *r*^{2} and *U*, we compared the models using both Spearman's rank correlation coefficient, ρ, and Pearson's correlation coefficient, *r*, to see if the growth trends between the models were similar or significantly different; (5) the model that satisfied steps 1–3 was selected as the best fitting model, based on the principle of *Occam's razor* (i.e., the simplest model for explaining the results is the preferred model); (6) we used the test of significance for the correlation coefficient, *r*, to test whether sample data were sufficient to model the relationship in this cohort correctly for every model tested for every subject; and (7) for the best fit model, we compared the normality of residuals of the model with the original GA area scale (i.e., mm^{2}/year) to that of the square root–transformed GA area scale (i.e., mm/year) to see if they satisfied the assumptions of normality for the residuals.

*r*^{2} values for all models were computed. These values were placed in the *r*^{2} matrix for comparison. For example, in the linear model, *r*^{2} is the proportion of the variance in the dependent variable, *Y*, that is predictable from the independent variable, *X*.

*r*^{2} value. The frequency of the best model occurrence was quantified and tabulated. For the best model chosen for each patient, the uncertainty metric *U* (Appendix, Eq. 10) was calculated. Note that *r*^{2} is the variability accounted for by the regression model itself, whereas the value *U* for uncertainty is the variability not accounted for by the regression model.
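The per-eye comparison described above can be sketched as follows. This is a minimal illustration under stated assumptions: the three eyes and their measurements are hypothetical, only two candidate models are included, and the quantity 1 − *r*^{2} is used purely as a stand-in for unexplained variability. It is not the metric *U* of Appendix Eq. 10, which is not reproduced here and may be defined differently.

```python
import math
from collections import Counter

def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept of y on x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx

def r_squared(times, areas, transform):
    """r^2 of a least-squares fit of area on transform(time)."""
    zs = [transform(t) for t in times]
    slope, intercept = fit_line(zs, areas)
    mean_area = sum(areas) / len(areas)
    ss_res = sum((a - (intercept + slope * z)) ** 2 for z, a in zip(zs, areas))
    ss_tot = sum((a - mean_area) ** 2 for a in areas)
    return 1 - ss_res / ss_tot

# Two illustrative candidate models (the study compared more).
MODELS = {"linear": lambda t: t, "logarithmic": math.log}

# Hypothetical eyes: (months, starting at 1 so the logarithm is defined; area in mm^2).
eyes = {
    "eye_A": ([1, 7, 13, 19, 25], [2.0, 2.9, 4.1, 5.0, 6.1]),
    "eye_B": ([1, 6, 12, 24, 36], [1.0, 2.8, 3.5, 4.2, 4.6]),
    "eye_C": ([1, 13, 25, 37], [0.5, 1.6, 2.4, 3.5]),
}

# r^2 matrix: model type x eye.
r2_matrix = {
    name: {eye: r_squared(t, a, f) for eye, (t, a) in eyes.items()}
    for name, f in MODELS.items()
}

# Best model per eye (highest r^2) and how often each model wins.
best_per_eye = {eye: max(MODELS, key=lambda m: r2_matrix[m][eye]) for eye in eyes}
best_counts = Counter(best_per_eye.values())

# Stand-in for unexplained variability (NOT the paper's U).
unexplained = {eye: 1 - r2_matrix[best_per_eye[eye]][eye] for eye in eyes}

print(best_per_eye)
print(dict(best_counts))
```

Storing the matrix as a nested dictionary keeps the sketch dependency-free; with a real cohort, a two-dimensional array indexed by model and eye would serve the same purpose.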

*r*^{2} and low average *U*, that met clinical assumptions of progression, was deemed the most practical model for GA progression. We then assessed the residuals of the selected regression model (both with and without square root transformation) to see how the residuals of the model behaved and whether transformation improved the normality of residuals.

*r*, can be used to determine whether *r*^{2} and *U* for each model fitted are significant, where the test is adjusted for sample size (see Appendix). In addition, with respect to the population of GA area measurements, the Central Limit Theorem and sample size calculations can be used to support sufficiency of sample size (i.e., the sampling distribution of the mean values tends toward a normal distribution, for *n* > 30).
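For orientation, the standard textbook form of the test of significance for a correlation coefficient is given below. The Appendix is not reproduced here, so this form is an assumption about the test referred to above and may differ in detail from the authors' formulation. The worked numbers are hypothetical and chosen only to show how the sample size *n* enters the statistic.

```latex
\begin{align}
  t &= \frac{r\sqrt{n-2}}{\sqrt{1-r^{2}}}, \qquad \text{with } n-2 \text{ degrees of freedom}, \\[4pt]
  \text{e.g., } r^{2} &= 0.92,\; n = 6: \quad
  t = \frac{\sqrt{0.92}\,\sqrt{4}}{\sqrt{0.08}} \approx \frac{0.959 \times 2}{0.283} \approx 6.78 .
\end{align}
```

For 4 degrees of freedom, the two-tailed critical value at the 0.01 level is 4.604, so this hypothetical fit would be significant at that level. The same *r*^{2} with fewer presentations gives a smaller *t* and fewer degrees of freedom, which is how the test accounts for sample size.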

^{2} (range: 0.11–20.69 mm^{2}). The average change in GA area from baseline to the last measurement was 6.07 ± 4.99 mm^{2}. The mean age at baseline was 76.84 ± 8.60 years (range: 55–96 years). The cohort was generally a diverse representation of GA progression, with a mixture of slow and fast progressors. The median ART was 35 (IQR: 25, 75; range: 5–100).

*r*^{2}

*r*^{2} matrix. That is, the models were assessed with and without transformation, with dimensions of the matrix being 2 (outcomes) × 5 (models) × 81 (eyes), giving 810 *r*^{2} values. The average *r*^{2} values for each eye were calculated for the five models (Table 1). The linear, logarithmic, and Q2 models were the best candidates for GA growth modeling based on average *r*^{2}. Further graphical examination of the logarithmic, linear, and Q2 models revealed that the gradients were indistinguishable. To illustrate further, consider the case shown in Figure 2B. For this case, the linear and logarithmic models both had a coefficient of determination of *r*^{2} = 0.9995, whereas the Q2 model had *r*^{2} = 0.9994. A paired *t*-test of comparison for the 81 eyes showed that there was no statistically significant difference between the linear and logarithmic models (level of significance: *P* < 0.05). The correlation coefficients, Spearman's ρ and Pearson's *r*, were used to compare the slopes of the models, resulting in ρ = 1 and *r* = 0.9999. There were small differences in the coefficient of determination, and the patterns of progression were similar. The linear model was deemed preferable, as it was similar to the logarithmic and Q2 models with respect to average *r*^{2}, and it displayed the lowest average *U*. Implementation and interpretation of the linear model are simple, and the linear gradient is a direct measure of the rate of GA progression. The parameters of the logarithmic and Q2 models do not have straightforward interpretations.

*U* shows more explicitly the unexplained variability associated with the regression model, that is, uncertainty that is the sum of statistical variability and model structure uncertainty. Quantification and ranking of average uncertainty for each regression model tested revealed that the linear model had the lowest average *U* value (0.025; Table 2).

*r*, showed statistical significance (*P* = 0.01) for the GA original scale and the square root–transformed scale, providing support for the tabulated regression models. The problem of small sample size often encountered in ophthalmology has been a continuing issue because of the wide range of eye conditions that may often be rare in occurrence. Studies have shown that the length of the statistical confidence interval for a regression model outcome varies inversely with sample size, with the relationship flattening out for *n* > 30.^{27,28} For *n* > 30, the sampling distribution of the mean values tends toward a normal distribution according to the Central Limit Theorem. In the current study, there were *n* = 81 eyes, each having its own time-series plots of GA data, for a total of *n* = 531 for GA data, stratified so that the individual models are fitted separately to each eye.^{29}

This study investigated the epistemic uncertainty referred to as “model structure uncertainty” associated with modeling GA trend and growth in patient data.

*r*^{2} (0.9205) over 81 eyes, and the lowest average uncertainty (*U* = 0.025). These findings were consistent, even when the total GA area outcome was square root transformed (*r*^{2} = 0.9299, *U* = 0.014). The models that displayed a nearly identical trend to the linear model were the logarithmic and Q2 models. When superimposing the trendlines, it was difficult to distinguish one from the other. A marginal difference was found between the average *r*^{2} produced for linear, logarithmic, and Q2 models. However, the logarithmic and Q2 models not only had higher average uncertainty than the linear model but also posed problems of interpretation and application.

*A* is the total area of the hypoautofluorescence as derived from the RegionFinder segmentation algorithm, *t* is the date/time of the patient presentation, *g* is the gradient representing the rate of growth of GA, and *t*_{0} is the upper bound on the date of onset. The key parameters *g* and *t*_{0} are obtained from regression analysis of the history of patient presentations. The gradient parameter is an easily understood metric for GA growth for patient presentations in a clinical setting. It can be computed for small datasets and is readily incorporated into software for onscreen analytics of data recorded by the instrumentation.
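A minimal sketch of how *g* and *t*_{0} might be obtained from a history of presentations is given below. It assumes the linear model takes the form *A* = *g*(*t* − *t*_{0}), which is consistent with the parameter definitions above but is not stated explicitly in this text. The function name `fit_ga_growth` and the example measurements are hypothetical.

```python
def fit_ga_growth(times, areas):
    """Least-squares fit of A = g * (t - t0) to one eye's presentations.

    times: presentation times (e.g., months since first visit)
    areas: total GA area at each presentation (mm^2)
    Returns (g, t0): gradient and the time at which the fitted line crosses A = 0.
    """
    n = len(times)
    if n < 2:
        raise ValueError("at least two presentations are needed")
    mean_t = sum(times) / n
    mean_a = sum(areas) / n
    stt = sum((t - mean_t) ** 2 for t in times)
    if stt == 0:
        raise ValueError("presentation times must not all be identical")
    g = sum((t - mean_t) * (a - mean_a) for t, a in zip(times, areas)) / stt
    if g == 0:
        raise ValueError("zero gradient: t0 is undefined")
    intercept = mean_a - g * mean_t  # fitted area at t = 0
    t0 = -intercept / g              # A = g*t + intercept = g*(t - t0)
    return g, t0

# Hypothetical eye: area in mm^2 at 0, 6, 12, 18, and 24 months.
months = [0, 6, 12, 18, 24]
area_mm2 = [2.0, 2.6, 3.2, 3.8, 4.4]

g, t0 = fit_ga_growth(months, area_mm2)
print(f"g = {g:.3f} mm^2/month, t0 = {t0:.1f} months")
```

Here a negative *t*_{0} means that the fitted line reaches zero area before the first presentation. The fit needs only two or more presentations, in keeping with the small datasets typical of clinical follow-up.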

*g*, which quantifies the rate of GA progression. The value of *g* is an objective metric that quantifies the impact of the clinical intervention. A high value of *g* identifies a fast progressor, and a low value of *g* identifies a slow progressor. An intervention will change the gradient of the line from the time it is applied (requiring a second linear regression model from the start of the intervention). The gradient is therefore a *figure of merit* for response to the intervention.
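The use of the gradient as a figure of merit can be sketched by fitting one line before and a second line from the start of an intervention, then comparing the two gradients. The data below are entirely hypothetical (the study reports no intervention data), and `gradient` is an illustrative helper rather than part of any instrument software.

```python
def gradient(times, areas):
    """Least-squares slope of area against time (rate of GA growth)."""
    n = len(times)
    mean_t = sum(times) / n
    mean_a = sum(areas) / n
    stt = sum((t - mean_t) ** 2 for t in times)
    return sum((t - mean_t) * (a - mean_a) for t, a in zip(times, areas)) / stt

# Hypothetical eye: presentations before the intervention (months, mm^2) ...
pre_months, pre_area = [0, 6, 12, 18], [2.0, 2.6, 3.2, 3.8]
# ... and from the start of the intervention at month 18 onward.
post_months, post_area = [18, 24, 30, 36], [3.8, 4.1, 4.4, 4.7]

g_pre = gradient(pre_months, pre_area)     # growth rate before intervention
g_post = gradient(post_months, post_area)  # growth rate after intervention

# Relative change in gradient; negative values indicate slowed progression.
relative_change = (g_post - g_pre) / g_pre

print(f"g_pre = {g_pre:.3f}, g_post = {g_post:.3f} mm^2/month")
print(f"relative change in gradient: {relative_change:.0%}")
```

Fitting the two segments separately follows the description above of a second regression model from the start of the intervention. A single piecewise model constrained to be continuous at the intervention time would be an alternative design.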

^{7,8} Although this proposed mechanism weakens the case for a Q1 model for growth in comparison to the alternative linear model, even a linear approximation would eventually be challenged by the finite area of the retina or limited field-of-view of imaging techniques. From a clinician's perspective, total GA cannot grow indefinitely, as the size of the retina provides a physical limit in area. The assumption is that eventually the measured growth rate will stop due to termination of patient presentations as the result of patient mortality, or no further change in visual acuity tests, or it will taper off significantly as progression decreases asymptotically.^{3,30} The shoulder-end of a hypothetical sigmoidal function may also be missing because of the late age and mortality of the patients (i.e., GA does not reach the physical limit of the retina area or when growth tapers asymptotically). These assumptions are consistent with clinical observations that there is an absence of data at the start and end of the GA data time-series that may indicate the presence of a sigmoidal function. In practice, there are usually only a small number of patient visits and the linear approximation may provide a reliable estimate of trend and a metric for GA progression. The key parameter is the gradient, *g*, where a high value represents a fast progressor and a low gradient a slow progressor.

Ying et al.^{31} reported methods for addressing this problem for a limited sample based on using left or right eye data only, or random eye selection from each pair, but this reduces sample size. Ying et al.^{31} also suggested the application of mixed-effects models, but with increased computational expense or processing. It was noted that, if there is intereye correlation present and it is ignored, the estimators of the regression coefficients are unbiased, but the variances of the estimators are not correct.^{31} This means that the model average is not affected, but the variance is likely to be smaller than otherwise. The impact of intereye correlation in the current study was expected to be marginal, however, because the regression models developed for each eye in the comparison were derived independently for each eye (not from the total eye population). The regression coefficients were based only on the GA area progression scatter plot for each eye separately. After repeating the process for all eyes, the performance metric was the average value of the *r*^{2} values of the set of models, not the variance. The median of the *r*^{2} values for the models was also computed for the best-fit three models (linear, logarithmic, and quadratic Q2), revealing similar results to the average, with no change to the conclusions (significant because the median involves a ranking of model outputs).

*r*^{2}, and the linear model was still lowest with respect to average *U* for all models.

^{32,33} Extensions could include further study of the square root transformation applied to GA subgroups using the best three regression models in the comparison (linear, logarithmic, and Q2 models).

**J. Arslan,** None; **K.K. Benke,** None; **G. Samarasinghe,** None; **A. Sowmya,** None; **R.H. Guymer,** is on the advisory boards of Bayer, Novartis, Roche Genentech, and Apellis; **P.N. Baird,** None

*Retina*. 2017; 37(5): 819–835. [CrossRef] [PubMed]

*Ophthalmology*. 2014; 121(5): 1079–1091. [CrossRef] [PubMed]

*Cochrane Database Syst Rev*. 2014(8): CD005139.

*Mol Aspects Med*. 2012; 33(4): 295–317. [CrossRef] [PubMed]

*Ophthalmology*. 2018; 125(3): 369–390. [CrossRef] [PubMed]

*Invest Ophthalmol Vis Sci*. 2020; 61(1): 2. [PubMed]

*Ophthalmic Epidemiol*. 2005; 12(6): 353–362. [CrossRef] [PubMed]

*Arch Ophthalmol*. 2009; 127(9): 1168–1174. [CrossRef] [PubMed]

*Int J Environ Res Public Health*. 2018; 15(4): 592. [CrossRef]

*Ecol Appl*. 2002; 12(2): 618–628. [CrossRef]

*Evolutionary Multi-Criterion Optimization*. EMO 2005. Lecture Notes in Computer Science, vol. 3410. Berlin, Heidelberg: Springer; 2005.

*J R Stat Soc Series B Stat Methodol*. 1995; 57(1): 45–97.

*Ecol Modell*. 2002; 157(2): 313–329. [CrossRef]

*Invest Ophthalmol Vis Sci*. 2008; 49(2): 479–489. [CrossRef] [PubMed]

*Invest Ophthalmol Vis Sci*. 1995; 36(3): 718–729. [PubMed]

*Invest Ophthalmol Vis Sci*. 2017; 58(6): BIO61–BIO67. [CrossRef] [PubMed]

*Int J Retina Vitreous*. 2016; 2: 12. [CrossRef] [PubMed]

*Invest Ophthalmol Vis Sci*. 2020; 61: 13. [CrossRef]

*Invest Ophthalmol Vis Sci*. 2011; 52(14): 161.

*Curr Opin Ophthalmol*. 2016; 27(3): 217–223. [CrossRef] [PubMed]

*Retina*. 2014; 34(3): 576–582. [CrossRef] [PubMed]

*Invest Ophthalmol Vis Sci*. 2017; 58(6): BIO121–BIO130. Erratum in: *Invest Ophthalmol Vis Sci*. 2018; 59(2): 674. [CrossRef] [PubMed]

*Ophthalmology*. 2011; 118: 679–686. [CrossRef] [PubMed]

*Retina*. 2019; 39: 1527–1540. [CrossRef] [PubMed]

*Transl Vis Sci Technol*. 2018; 7(6): 40. [CrossRef] [PubMed]

*Ophthalmology*. 2013; 120(12): 2666–2671. [CrossRef] [PubMed]

*Pharmaceut Statist*. 2005; 4(4): 287–291. [CrossRef]

*Educ Psychol Meas*. 2010; 70(3): 394–400. [CrossRef]

*Soil Res*. 2015; 53(6): 592–604. [CrossRef]

*Expert Opin Ther Targets*. 2009; 13(6): 641–651. [CrossRef] [PubMed]

*Ophthalmic Epidemiol*. 2017; 24(2): 130–140. [CrossRef]

*Invest Ophthalmol Vis Sci*. 2020; 61: 11. [CrossRef]

*Br J Ophthalmol*. 2021; 105: 239–245. [CrossRef] [PubMed]