Procedures were simulated using techniques similar to those of previous studies.1,2,20,21,25,26 Briefly, the interpolated empirical visual field data acted as input “true” thresholds at every location on the 1.5° grid for simulation. For each visual field location of each simulated patient, the probability of responding to a presented stimulus of intensity x was modeled with the function
\begin{eqnarray*}
P(\mathrm{response}) = 1 - \mathit{FN} - \left( 1 - \mathit{FN} - \mathit{FP} \right) \times G(x, t, s)
\end{eqnarray*}
in which G(x, t, s) represents the value at x of a cumulative Gaussian function with a mean equal to the assumed true threshold t and standard deviation s. False-negative and false-positive response rates are given by FN and FP, respectively. The spread of the function (s) varied with the input threshold (t) according to a function given by Henson et al.,13 except that the spread was capped at 6 dB to better match more recent empirical data:27,28
\begin{eqnarray*}
s = \min\left( \exp\left( -0.081t + 3.27 \right),\, 6 \right)
\end{eqnarray*}
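The response model above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the simulation code used in the study: the function names (`spread`, `cumulative_gaussian`, `p_response`, `simulate_response`) are hypothetical, and the default error rates are simply the 5% false-positive and 3% false-negative values quoted in the text.

```python
import math
import random


def spread(t, cap=6.0):
    """Standard deviation s (dB) for true threshold t (dB), capped at `cap`."""
    return min(math.exp(-0.081 * t + 3.27), cap)


def cumulative_gaussian(x, t, s):
    """G(x, t, s): cumulative Gaussian with mean t and SD s, evaluated at x."""
    return 0.5 * (1.0 + math.erf((x - t) / (s * math.sqrt(2.0))))


def p_response(x, t, fp=0.05, fn=0.03):
    """Probability of responding to a stimulus of intensity x (dB)."""
    s = spread(t)
    return 1.0 - fn - (1.0 - fn - fp) * cumulative_gaussian(x, t, s)


def simulate_response(x, t, fp=0.05, fn=0.03, rng=random):
    """Draw one simulated yes/no response (True means the stimulus was seen)."""
    return rng.random() < p_response(x, t, fp, fn)


if __name__ == "__main__":
    print(round(spread(30.0), 2))            # 2.32
    print(spread(10.0))                      # 6.0 (cap applies)
    print(round(p_response(30.0, 30.0), 2))  # 0.51
```

With these definitions, a stimulus far brighter than threshold (low dB) is seen with probability 1 − FN, and one far dimmer than threshold (high dB) with probability FP, which is the behavior the equation implies.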
Each procedure was simulated 200 times per simulated patient (n = 97, therefore 19,400 total simulated visual fields) to generate distributions of output visual fields for comparison.
We compared Full Threshold with SpaBS and STAMP under response error conditions we estimated to be typical for naïve observers, or slightly above average for experienced observers.29 The three procedures were simulated with 5% false-positive responses and 3% false-negative responses.
Although SpaBS and Full Threshold have implicit stopping criteria, STAMP does not, and would therefore continue indefinitely unless a predetermined stopping criterion were imposed. Stopping criteria could be based on entropy (overall or pointwise), but in this study we used a simple criterion: stopping after a preset total number of presentations across all locations. We simulated STAMP stopping after various total numbers of presentations, chosen to be approximately 50%, 60%, 70%, 80%, and 100% of the median number of presentations we estimated SITA Standard would make on these patients. This estimate was calculated as the median number of presentations made by Full Threshold, minus one presentation per location.
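The arithmetic behind the fixed-count stopping rule can be illustrated as follows. This is a sketch under assumptions: the function name `stamp_budgets`, the rounding to whole presentations, and the example counts and number of locations are all hypothetical and are not taken from the study. The sketch also assumes that "minus one per location" means subtracting the number of tested locations from the median Full Threshold count.

```python
import statistics


def stamp_budgets(ft_counts, n_locations, fractions=(0.5, 0.6, 0.7, 0.8, 1.0)):
    """Return {fraction: total presentation budget} for the stopping rule.

    ft_counts   -- total Full Threshold presentations per simulated field
    n_locations -- number of tested locations (one presentation is
                   subtracted per location to estimate SITA Standard)
    """
    sita_estimate = statistics.median(ft_counts) - n_locations
    # Rounding to whole presentations is an assumption of this sketch.
    return {f: round(f * sita_estimate) for f in fractions}


if __name__ == "__main__":
    # Made-up counts and location number, for illustration only.
    counts = [400, 420, 380, 450, 410]
    print(stamp_budgets(counts, n_locations=50))
    # {0.5: 180, 0.6: 216, 0.7: 252, 0.8: 288, 1.0: 360}
```

Each simulated STAMP run would then terminate once its running total of presentations reached the budget for the condition being tested.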
To evaluate performance under other response error conditions, Full Threshold and the best-performing of the new procedures were also simulated with no response errors (false-positive and false-negative rates both 0%) and with high false-positive errors (false-positive rate 15%, false-negative rate 3%).
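The three response error conditions can be collected in a small configuration structure, sketched here. The class name `ErrorCondition`, the condition labels, and the `asymptotes` helper are hypothetical. Only the rates come from the text. The asymptotes follow from the response function: a clearly visible stimulus is seen with probability 1 − FN, and an invisible one with probability FP.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ErrorCondition:
    name: str   # illustrative label, not a term from the study
    fp: float   # false-positive rate
    fn: float   # false-negative rate

    def asymptotes(self):
        """(P(response) for a clearly visible stimulus, for an invisible one)."""
        return (1.0 - self.fn, self.fp)


CONDITIONS = [
    ErrorCondition("typical", fp=0.05, fn=0.03),
    ErrorCondition("no errors", fp=0.0, fn=0.0),
    ErrorCondition("high false positive", fp=0.15, fn=0.03),
]

if __name__ == "__main__":
    for cond in CONDITIONS:
        upper, lower = cond.asymptotes()
        print(f"{cond.name}: upper={upper:.2f}, lower={lower:.2f}")
```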