**Purpose:**
To explore the performance of patient-specific prior information, for example, from structural imaging, in improving perimetric procedures.

**Methods:**
Computer simulation was used to determine the error distribution and presentation count for Structure–Zippy Estimation by Sequential Testing (ZEST), a Bayesian procedure with a prior distribution centered on a threshold prediction from structure. Structure-ZEST (SZEST) was trialled for single locations with combinations of true and predicted thresholds between 1 and 35 dB, and compared with a standard procedure (Full-Threshold, FT) whose variability is similar to that of the Swedish Interactive Thresholding Algorithm (SITA). Clinical tests of glaucomatous visual fields (*n* = 163, median mean deviation −1.8 dB, 90% range +2.1 to −22.6 dB) were also compared between techniques.

**Results:**
For single locations, SZEST typically outperformed FT when structural predictions were within ± 9 dB of true sensitivity, depending on response errors. In damaged locations, mean absolute error was 0.5 to 1.8 dB lower, SD of threshold estimates was 1.2 to 1.5 dB lower, and 2 to 4 (29%–41%) fewer presentations were made for SZEST. Gains were smaller across whole visual fields (SZEST, mean absolute error: 0.5 to 1.2 dB lower, threshold estimate SD: 0.3 to 0.8 dB lower, 1 [17%] fewer presentation). The 90% retest limits of SZEST were median 1 to 3 dB narrower and more consistent (interquartile range 2–8 dB narrower) across the dynamic range than those for FT.

**Conclusion:**
Seeding Bayesian perimetric procedures with structural measurements can reduce test variability of perimetry in glaucoma, despite imprecise structural predictions of threshold.

**Translational Relevance:**
Structural data can reduce the variability of current perimetric techniques. A strong structure–function relationship is not necessary; however, structure must predict function within ±9 dB for gains to be realized.

^{ 1 } Sensitivity estimates from these procedures show significant test–retest variability, the magnitude of which is inversely related to sensitivity, becoming more than half the measurement range of the perimeter in some cases,^{ 2,3 } hampering detection of change due to disease progression. A major goal of current research is to improve the precision of sensitivity estimates in SAP while maintaining clinically acceptable test times, generally less than 10 minutes per eye.

^{ 4 } Zippy Estimation by Sequential Testing [ZEST]^{ 5 }) are seeded with information based on population data rather than information from the specific test patient. One example of patient-specific prior information is the sensitivity estimates obtained from previous visual field tests. Seeding Bayesian procedures with the results of previous tests reduces test–retest variability in computer simulations of perimetric outcomes.^{ 6 } Sensitivities from previous tests are used in one commercially available staircase procedure, German Adaptive Thresholding Estimation (GATE), which has similar variability characteristics to SITA, but reduces test duration.^{ 7 }

^{ 8–11 } Although reported relationships are not perfect, it is apparent that structural and functional measurements are not independent, and so measurements from imaging are another source of prior information that could be exploited in perimetric procedures.

^{ 12 }

*x* decibels were simulated according to psychometric functions (Ψ) modelled by cumulative Gaussian functions with response errors incorporated according to the formula^{13}:

$$\Psi(x) = fp + (1 - fp - fn)\,\bigl[1 - G(x, t, s)\bigr]$$

where *fp* is the false positive rate defining the lower asymptote of Ψ, *fn* is the false negative rate defining the upper asymptote of Ψ (which sits at 1 − *fn*), *t* is the threshold, *s* is the SD of the cumulative Gaussian curve defining the spread of Ψ, and *G*(*x, t, s*) is the value at *x* of a cumulative Gaussian distribution with mean *t* and SD *s*. As the slope of frequency-of-seeing curves for SAP stimuli flattens with decreasing sensitivity,^{14,15} *s* in the above formula was varied with sensitivity according to a formula proposed by Henson et al.^{15} that was based on empirical data for healthy subjects but capped at 6 dB.
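The response model described above can be sketched in a few lines of Python. This is an illustrative sketch rather than the simulation code used in the study: the function names are hypothetical, and the coefficients −0.081 and 3.27 in `henson_spread` are an assumption (the values commonly quoted for the Henson et al. formula), since the text states only that the spread was capped at 6 dB.

```python
import math
import random
from statistics import NormalDist


def henson_spread(threshold_db, cap_db=6.0):
    """SD (dB) of the psychometric function at a given threshold.

    ASSUMPTION: the coefficients -0.081 and 3.27 are not given in the text;
    only the 6 dB cap is stated there.
    """
    return min(cap_db, math.exp(-0.081 * threshold_db + 3.27))


def prob_seen(x_db, threshold_db, fp, fn):
    """Probability of a 'seen' response to a stimulus of x_db decibels.

    Higher dB means a dimmer stimulus, so the probability falls from
    1 - fn (bright stimuli) to fp (dim stimuli) as x_db increases.
    """
    s = henson_spread(threshold_db)
    g = NormalDist(mu=threshold_db, sigma=s).cdf(x_db)
    return fp + (1.0 - fp - fn) * (1.0 - g)


def simulate_response(x_db, threshold_db, fp, fn, rng=random):
    """Draw one simulated yes/no response (True means 'seen')."""
    return rng.random() < prob_seen(x_db, threshold_db, fp, fn)
```

With no response errors (`fp = fn = 0`), a stimulus presented exactly at threshold is seen half the time, and the asymptotes move to `fp` and `1 - fn` as response errors are introduced.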

*pmf*) centered on an estimate of threshold. This is in contrast to the ZEST procedures in the perimetry literature, where the *pmf* is typically derived from population data.^{ 3,16 } Figure 1 shows a comparison of threshold estimates and number of presentations made by SZEST with four Gaussian prior *pmf*s to the exact same procedure with a population-derived *pmf* (a standard ZEST procedure; the *pmf* used is shown in Turpin et al.,^{ 3 } Fig. 2). As would be expected, Figure 1 shows that using a prior *pmf* weighted close to the true threshold of an observer improves procedure performance compared with a generic prior *pmf*. However, using a prior *pmf* weighted away from true threshold can increase error and number of presentations above the population-based ZEST procedure. In this study, therefore, we investigate how good the prior *pmf* of SZEST must be in order to provide gains in clinical applications, where we consider SITA Standard to represent current clinical standards.

**Figure 1.**


**Figure 2.**


^{ 4 } The full details of the SITA algorithms are not publicly available, so we use FT as a surrogate as it has similar test–retest characteristics to SITA Standard.^{ 2,3,17–20 } SITA Standard is on average 1 to 1.5 presentations faster than FT at each location; thus, we make sure that SZEST is also faster than FT to allow our indirect comparison between SZEST and SITA to be fair.

^{ 3 }figure 1.

*pmf* is a Gaussian distribution centered on a sensitivity prediction made by a structural measure. We chose a Bayesian procedure because Bayesian-like procedures are well established in perimetry and easily incorporate different sources of prior information or different desired outcomes. For example, the weighting applied to prior information can be adjusted according to the predictive power of the information and the consequences of erroneous predictions. Staircase procedures such as FT can also be modified to incorporate prior information by altering the start point of the staircase, but this has little effect on procedure bias when the starting guess is within ±10 dB of the true value.^{ 21 } Finally, Bayesian procedures have been shown in previous simulation studies to estimate threshold with less bias than FT.^{ 3 }

*pmf*, likelihood function and termination criteria.^{ 5,22 } We were primarily interested in the effect of altering the prior *pmf*, so we used the same likelihood function as in previous simulations (shown in Turpin et al.,^{ 3 } figure 2), whose slope is derived from frequency-of-seeing curves of healthy observers to perimetric stimuli.^{ 23 } We chose to use dynamic termination criteria as these reduce test variability.^{ 22 } We trialled many combinations of termination criteria and prior *pmf* SDs (data not shown), and settled on a prior *pmf* with SD 5 dB, and a termination criterion of posterior *pmf* SD less than 1.5 dB. This represented the best trade-off between accuracy and test duration in initial simulations where prior information perfectly predicted sensitivity. The procedure was implemented as a standard ZEST procedure.^{ 3,6,22 } Once the procedure terminated, the expectation of the final posterior *pmf* was rounded to the nearest integer decibel value and reported as the final threshold estimate. The simulated perimeter had a range of stimulus intensities from 0 to 40 dB, but *pmf*s were calculated over a range extended by 10 dB either side of this so that thresholds close to the range limits of the perimeter were achievable. For full-field simulations using SZEST each location was tested independently; no growth pattern was used.
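The procedure described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in the study: the likelihood function here is a hypothetical cumulative Gaussian (SD 1 dB with 3% false responses) standing in for the one taken from Turpin et al., each stimulus is placed at the mean of the current *pmf* as is usual for ZEST, and the `max_presentations` guard and all function names are additions for illustration. The prior SD (5 dB), the stopping rule (posterior SD below 1.5 dB), the −10 to 50 dB domain and the rounded-mean estimate follow the text.

```python
from statistics import NormalDist

# Perimeter range is 0-40 dB; the pmf domain is extended 10 dB either side.
DOMAIN = list(range(-10, 51))


def gaussian_prior(predicted_db, sd=5.0):
    """Discrete Gaussian prior pmf centered on the structural prediction."""
    weights = [NormalDist(predicted_db, sd).pdf(t) for t in DOMAIN]
    total = sum(weights)
    return [w / total for w in weights]


def likelihood_seen(stim_db, t_db, fp=0.03, fn=0.03, slope_sd=1.0):
    """P(seen | threshold t_db). ASSUMPTION: hypothetical shape and parameters."""
    return fp + (1 - fp - fn) * (1 - NormalDist(t_db, slope_sd).cdf(stim_db))


def pmf_mean_sd(pmf):
    """Mean and SD of a pmf defined over DOMAIN."""
    mean = sum(p * t for p, t in zip(pmf, DOMAIN))
    var = sum(p * (t - mean) ** 2 for p, t in zip(pmf, DOMAIN))
    return mean, var ** 0.5


def szest(predicted_db, respond, prior_sd=5.0, stop_sd=1.5, max_presentations=100):
    """Run one location; `respond(stim_db)` returns True when the stimulus is seen."""
    pmf = gaussian_prior(predicted_db, prior_sd)
    mean, sd = pmf_mean_sd(pmf)
    n = 0
    while sd >= stop_sd and n < max_presentations:
        # Present at the pmf mean, clamped to what the perimeter can display.
        stim = min(40, max(0, round(mean)))
        seen = respond(stim)
        n += 1
        like = [likelihood_seen(stim, t) for t in DOMAIN]
        if not seen:
            like = [1 - v for v in like]
        # Bayes update: multiply by the likelihood and renormalize.
        pmf = [p * v for p, v in zip(pmf, like)]
        total = sum(pmf)
        pmf = [p / total for p in pmf]
        mean, sd = pmf_mean_sd(pmf)
    return round(mean), n
```

For example, `szest(25, lambda stim: stim <= 25)` runs the procedure for an idealized observer with a threshold of 25 dB and a perfectly accurate structural prediction, returning the threshold estimate and the number of presentations made.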

*pmf*, and the direction of the bias is towards the center of the prior *pmf*. Hypothesis testing was performed with two-tailed Wilcoxon Rank-sum tests.

^{ 6 }

*P* < 0.001, two-tailed Wilcoxon Rank-sum test) and SZEST made significantly fewer presentations than FT (*P* < 0.001, two-tailed Wilcoxon Rank-sum test) across all response conditions. Reductions in variability and presentation count were generally greater for damaged true thresholds than healthy true thresholds.

**Figure 3.**


**Figure 4.**


^{ 3,6 } When there were few response errors (black bars in Fig. 5) SZEST generally outperformed FT when SPA was within approximately ±10 to 12 dB. The position of the bars for true thresholds less than or equal to 5 dB and greater than or equal to 31 dB reflects the reduced range of possible SPA in these situations due to floor/ceiling effects. The general skew of the red and blue bars in Figure 5 reflects the effect of response errors on the performance of both procedures.

**Figure 5.**


*pmfs*. We constructed a hypothetical situation where structural measurements predict threshold with accuracy similar to the range across which SZEST outperformed FT in Experiment One. For simplicity, a single SPA range of ±9 dB was chosen as a reasonable approximation of the range across which SZEST performs well (Fig. 5). Gaussian noise (SD 3 dB) was added to the true thresholds to generate predicted thresholds that were within approximately ±9 dB of true threshold. Any predicted thresholds below 0 dB were set to 0 dB to constrain predictions to the measurement range of the simulated perimeter.
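The generation of predicted thresholds described above can be sketched as below. The function name and the use of Python's `random` module are illustrative assumptions; the noise SD of 3 dB and the floor at 0 dB come from the text. Because roughly 99.7% of Gaussian noise falls within three SDs, almost all predictions land within ±9 dB of the true threshold.

```python
import random


def predicted_threshold(true_db, noise_sd=3.0, floor_db=0.0, rng=random):
    """Simulated structural prediction: true threshold plus Gaussian noise.

    Predictions below the floor are set to the floor so that they stay
    within the measurement range of the simulated perimeter.
    """
    return max(floor_db, true_db + rng.gauss(0.0, noise_sd))


def fraction_within(true_db, tolerance_db=9.0, trials=10000, seed=1):
    """Proportion of simulated predictions within the tolerance of truth."""
    rng = random.Random(seed)
    hits = sum(
        abs(predicted_threshold(true_db, rng=rng) - true_db) <= tolerance_db
        for _ in range(trials)
    )
    return hits / trials
```

Calling `fraction_within(20)` gives a quick check of how often the simulated predictions fall inside the ±9 dB range for a true threshold of 20 dB.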

*t*-tests. The 90% retest limits were compared for both procedures at each input (“true”) threshold and for each response error condition using two-tailed Wilcoxon Rank-Sum tests.

*P* < 0.001, two-tailed Welch's *t*-tests). On average, SZEST made approximately one fewer presentation per location than FT (*P* < 0.001, two-tailed Welch's *t*-test).

**Table 1**.

*P* < 0.001, two-tailed Wilcoxon Rank-sum tests), particularly at reduced true thresholds. The retest limits of SZEST were also more consistent in width across both the dynamic measurement range of the simulated perimeter and the different response error conditions than those for FT (Table 2). Given the decreasing slope of the psychometric function used to model patient responses at reduced thresholds, these results suggest that, under the conditions simulated, output thresholds estimated using SZEST may be less affected by the flattening of frequency-of-seeing curves with decreasing sensitivity^{ 15 } than those estimated by FT.
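As an illustration of the quantity being compared, 90% retest limits can be computed from repeated simulated estimates at one true threshold as the interval between the 5th and 95th percentiles. The sketch below assumes that definition and the inclusive percentile method; neither is specified in the text, and the function name is hypothetical.

```python
from statistics import quantiles


def retest_limits_90(estimates):
    """Return (lower, upper, width) of the 5th to 95th percentile interval.

    ASSUMPTION: 90% retest limits are taken as the central 90% of the
    repeated threshold estimates, using inclusive percentiles.
    """
    cuts = quantiles(estimates, n=20, method="inclusive")
    lower, upper = cuts[0], cuts[-1]
    return lower, upper, upper - lower
```

A narrower width at a given true threshold means less test–retest variability, and similar widths across true thresholds correspond to the consistency across the dynamic range discussed above.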

**Table 2**.

**Figure 6.**


^{ 24 } Experiment Two demonstrated this magnitude of reduction only for observers making high rates of response errors. Smaller reductions in test–retest variability may still be beneficial, though, particularly in clinical trial situations where different approaches may be taken to progression detection, such as the “wait and see” approach recently suggested by Crabb and Garway-Heath.^{ 25 }

^{ 18,19 } Further reductions in test times are achieved by dynamically altering response window duration during the test, and estimating false responses without catch trials,^{ 4 } but these are not relevant for our simulations. SZEST made on average about one presentation fewer per location than FT, making it similar to SITA Standard. However, since SITA Standard has similar test–retest characteristics to FT,^{ 2,3,17–20 } SZEST is likely to have less bias and variability than SITA Standard. Procedures with patient-specific priors based on structural data (SZEST) or previous visual field data (retest minimizing uncertainty [REMU])^{ 6 } demonstrate more consistent retest variability across the dynamic range, and across different variability conditions (Fig. 6, Table 2), potentially leading to an improved ability to detect progression at reduced sensitivities or in the presence of high rates of response errors.

*structural* prior information hinges on the availability of structural measurements that are sufficiently closely related to visual field threshold. The Hood–Kardon model^{ 26 } relating structural and functional measurements assumes an underlying 1:1 linear relationship between retinal ganglion cell axon loss and visual field sensitivity loss in glaucoma. Hence, a 3 dB loss of visual field sensitivity (a doubling in stimulus luminance at threshold) equates to a 50% loss of axons according to their model. Similarly, a 10 dB loss of sensitivity equates to a 90% loss of axons and, therefore, more than half of the dynamic range of commercially available perimeters is determined by less than 10% of retinal ganglion cells. Given the resolution and measurement variability of current imaging devices, the variability in clinical visual field threshold measurements and the population variance in both parameters, it is not surprising that in clinical data RNFL neuronal component thickness below 10% of mean normal (>90% loss of retinal ganglion cell axons) can only predict that visual field sensitivity will be reduced by around 10 dB or more compared with mean normal.^{ 27 } Predictions from this model would, on average, only allow SZEST to perform well in early to moderate glaucoma, though it is difficult to draw firm conclusions from these data as we require knowledge of how structural measures relate to *true* threshold, not clinically measured threshold. A recent study has highlighted that the between-subject variation seen as scatter in current structure-function relationships is still compatible with a close underlying structure-function relationship when measurement variability in both structure and function is taken into account, suggesting the relationship between structural measurements and underlying true threshold may be strong.^{ 28 }
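The decibel arithmetic in the paragraph above can be checked with a short calculation. The sketch assumes the simplified reading given in the text, in which the fraction of axons remaining is proportional to linear (non-logarithmic) sensitivity; it ignores the non-neuronal residual component of the full model, and the function name is hypothetical.

```python
def axon_fraction_lost(sensitivity_loss_db):
    """Fraction of axons lost for a given sensitivity loss in dB.

    ASSUMPTION: a 1:1 linear relationship between remaining axons and
    linear sensitivity, so the remaining fraction is 10 ** (-loss / 10).
    """
    remaining = 10.0 ** (-sensitivity_loss_db / 10.0)
    return 1.0 - remaining


# A 3 dB loss corresponds to about half of the axons,
# and a 10 dB loss to 90% of them.
loss_3db = axon_fraction_lost(3.0)
loss_10db = axon_fraction_lost(10.0)
```

Under this relationship, every further 10 dB of sensitivity loss removes 90% of whatever axons remain, which is why most of the dynamic range of the perimeter corresponds to a small residual population of ganglion cells.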

^{ 29 } made predictions of ganglion cell density from visual field sensitivity in glaucoma patients. Post mortem, they made histological counts of ganglion cell density and found that their predictions were accurate to within ±8 dB, with the residuals approximately normally distributed within this range. There was a lack of deep visual field defects in their sample, so it is unknown whether their predictions maintain the same level of accuracy at lower sensitivities. Another approach to predicting visual field sensitivity from structural measures is the Bayesian radial basis model described by Zhu et al.^{ 30 } This model predicts threshold sufficiently accurately for SZEST to perform well until visual field sensitivity falls below approximately 10 dB. Since this model was developed and tested on scanning laser polarimetry data, it might be possible to obtain improved predictions from newer, higher-resolution devices.

^{ 31,32 } but these do not adjust for individual variability. We have recently developed an anatomically-customizable model for mapping visual field locations to the optic nerve head, which we hope will aid development of more individualized perimetric tests for glaucoma.^{ 33 }

*pmf* and then continues from that point as in a standard ZEST algorithm. Other Bayesian procedures could be used in place of ZEST, or different parameters used within SZEST. For example, if structural sensitivity predictions were found to be nonnormally distributed about the true sensitivity then a prior *pmf* more closely related to that distribution could be used. The SD of the *pmf* may also be altered to reflect the predictive power of a given source of prior information. Our study provides a framework for the evaluation of such modifications in the future.

**Disclosure: J. Denniss,** Heidelberg Engineering GmbH (F); **A.M. McKendrick,** Heidelberg Engineering GmbH (F); **A. Turpin,** Heidelberg Engineering GmbH (F)

1. *Clin Exp Optom*. 2005; 88: 73–80.
2. *Invest Ophthalmol Vis Sci*. 2002; 43: 2654–2659.
3. *Invest Ophthalmol Vis Sci*. 2003; 44: 4787–4795.
4. *Acta Ophthalmol Scand*. 1997; 75: 368–375.
5. *Vision Res*. 1994; 34: 885–912.
6. *Invest Ophthalmol Vis Sci*. 2007; 48: 1627–1634.
7. *Invest Ophthalmol Vis Sci*. 2009; 50: 488–494.
8. *Invest Ophthalmol Vis Sci*. 2006; 47: 2889–2895.
9. *Invest Ophthalmol Vis Sci*. 2007; 48: 3662–3668.
10. *Invest Ophthalmol Vis Sci*. 2011; 52: 8732–8738.
11. *Clin Experiment Ophthalmol*. 2012; 40: 369–380.
12. *Atten Percept Psychophys*. 2010; 72: 2003–2012.
13. *Vision Res*. 1995; 35: 2503–2522.
14. *Invest Ophthalmol Vis Sci*. 1993; 34: 3534–3540.
15. *Invest Ophthalmol Vis Sci*. 2000; 41: 417–421.
16. *Optom Vis Sci*. 1999; 76: 588–595.
17. *Invest Ophthalmol Vis Sci*. 1999; 40: 1152–1161.
18. *Acta Ophthalmol Scand*. 1998; 76: 165–169.
19. *Acta Ophthalmol Scand*. 1998; 76: 368–375.
20. *Graefes Arch Clin Exp Ophthalmol*. 1999; 237: 29–34.
21. *Invest Ophthalmol Vis Sci*. 1992; 33: 2966–2974.
22. *Optom Vis Sci*. 2005; 82: 981–987.
23. *Invest Ophthalmol Vis Sci*. 2001; 42: 1404–1410.
24. *Invest Ophthalmol Vis Sci*. 2011; 52: 3237–3245.
25. *Invest Ophthalmol Vis Sci*. 2012; 53: 2770–2776.
26. *Prog Retin Eye Res*. 2007; 26: 688–710.
27. *Invest Ophthalmol Vis Sci*. 2009; 50: 4254–4266.
28. *Graefes Arch Clin Exp Ophthalmol*. 2012; 250: 1851–1861.
29. *Arch Ophthalmol*. 2006; 124: 853–859.
30. *Invest Ophthalmol Vis Sci*. 2010; 51: 5657–5666.
31. *Vision Res*. 2009; 49: 2157–2163.
32. *Invest Ophthalmol Vis Sci*. 2009; 50: 3249–3256.
33. *Invest Ophthalmol Vis Sci*. 2012; 53: 6981–6990.