October 2021
Volume 10, Issue 12
Open Access
Methods  |   October 2021
Quantifying Uncertainty of the Estimated Visual Acuity Behavioral Function With Hierarchical Bayesian Modeling
Author Affiliations & Notes
  • Yukai Zhao
    Center for Neural Science, New York University, New York, NY, USA
  • Luis Andres Lesmes
    Adaptive Sensory Technology Inc., San Diego, CA, USA
  • Michael Dorr
    Adaptive Sensory Technology Inc., San Diego, CA, USA
  • Zhong-Lin Lu
    Division of Arts and Sciences, NYU Shanghai, Shanghai, China
    Center for Neural Science and Department of Psychology, New York University, New York, NY, USA
    NYU-ECNU Institute of Brain and Cognitive Science at NYU Shanghai, Shanghai, China
  • Correspondence: Zhong-Lin Lu, Division of Arts and Sciences, NYU Shanghai, 1555 Century Avenue, Shanghai 200122, China. e-mail: zhonglin@nyu.edu 
Translational Vision Science & Technology October 2021, Vol.10, 18. doi:https://doi.org/10.1167/tvst.10.12.18
Abstract

Purpose: The goal of this study is to develop a hierarchical Bayesian model (HBM) to better quantify uncertainty in visual acuity (VA) tests by incorporating the relationship between VA threshold and range across multiple individuals and tests.

Methods: The three-level HBM consisted of multiple two-dimensional Gaussian distributions of hyperparameters and parameters of the VA behavioral function (VABF) at the population, individual, and test levels. The model was applied to a dataset of quantitative VA (qVA) assessments of 14 eyes in 4 Bangerter foil conditions. We quantified uncertainties of the estimated VABF parameters (VA threshold and range) from the HBM and compared them with those from the qVA.

Results: The HBM recovered covariances between the VABF parameters and provided better fits to the data than the qVA, reducing the uncertainty of the parameter estimates by 4.2% to 45.8%. On average, this uncertainty reduction meant that 3 fewer rows were needed than with the qVA to reach 95% accuracy in detecting a 0.15 logMAR change of the VA threshold or of both parameters.

Conclusions: The HBM utilized knowledge across individuals and tests in a single model and provided better quantification of the uncertainty of the estimated VABF, especially when the number of tested rows was relatively small.

Translational Relevance: The HBM can increase the accuracy in detecting VA changes. Further research is necessary to evaluate its potential in clinical populations.

Introduction
Visual acuity (VA) is the most important functional vision metric in the clinic.1 Precise assessment of VA is of paramount importance for accurate diagnosis of eye diseases, monitoring disease progression, and evaluating treatment efficacy, as well as for setting classification and qualification standards in many sports and professions.2–8 However, VA measurements are intrinsically noisy, and the estimated VA scores carry considerable uncertainty, because human VA behavior is probabilistic in nature.9–12 Given the exact same optotypes, the responses of a human observer may vary from trial to trial because of the various noise sources in the perceptual system.13–16 In this study, we apply the hierarchical Bayesian modeling (HBM) approach to improve the quantification of uncertainty in VA assessments. 
It is well known that human behavior in optotype identification can be modeled by the visual acuity behavioral function (VABF; Fig. 1A), that is, the probability of correct optotype identification as a function of optotype size. The function has two parameters: VA threshold, the optotype size required to reach a certain performance level, and VA range, which specifies how fast acuity behavior changes with increasing or decreasing optotype size.17–20 Although traditional VA assessments have focused only on estimating VA threshold, both VABF parameters may vary across individuals and disease stages.18,21–23 A complete characterization of the VABF requires a joint estimate of both parameters.24,25 
Figure 1.
 
VA behavioral function (VABF) and uncertainty in VA assessments. (A) A VABF with two parameters: VA threshold and range. (B) Estimated VABFs from repeated ETDRS tests, which sample the theoretical VABF with a number of optotype sizes, each tested five times (circles). (C) A color density plot of the point VABF parameter estimates across repeated ETDRS tests. (D) Data (circles) from one ETDRS assessment and the best fitting VABF (curve). (E) Probability distribution of the estimated VABFs from D. (F) The joint probability distribution of the estimated VABF parameters from D.
The VABF is a theoretical construct that specifies the probability of correct optotype identification at every possible optotype size. In actual VA assessments, the function is sampled with only a finite number of optotype sizes, each tested with a small number of trials. As a result, the empirical VABFs and the corresponding VABF parameters vary across repeated assessments (Figs. 1B, 1C). Figure 1C is a color density plot of the point estimates across multiple tests; it illustrates the joint probability of different point estimates of the VABF parameters from many repeated measurements. In this scheme, each VA assessment yields a single estimate of the VABF parameters, so repeated assessments are required to quantify the uncertainty. 
In the real world, where repeated VA assessments are most often not possible, how do we gauge uncertainty from a single VA assessment? This can be done with several modeling approaches.26 Given the results from a single VA assessment (Fig. 1D), the probability distribution of the VABF can be estimated by likelihood-based or Bayesian inference (Fig. 1E). The corresponding parameters and their associated probabilities form a joint distribution of the estimated VABF parameters (Fig. 1F). The mean and spread of the joint distribution can be used to quantify the estimated parameters and their uncertainties,26–30 with uncertainty often quantified by the half width of the 68.2% credible interval (68.2% HWCI) of the distribution,26,27 defined as the smallest interval that contains the true value of the estimated quantity with 68.2% probability. The estimated uncertainties from repeated tests (see Fig. 1C) and from a single measurement (see Fig. 1F) are equivalent under ideal experimental conditions.26,30,31 
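Given posterior samples of a parameter, the 68.2% HWCI can be found by scanning sorted samples for the narrowest interval that contains 68.2% of the mass. The sketch below is illustrative Python only (the study's analysis was implemented in Matlab and R/JAGS, and the function name `hwci` is ours):

```python
import numpy as np

def hwci(samples, mass=0.682):
    """Half width of the narrowest interval containing `mass` of the
    posterior samples (a highest-density-style credible interval)."""
    x = np.sort(np.asarray(samples, dtype=float))
    n = len(x)
    k = max(1, int(np.ceil(mass * n)))   # samples inside the interval
    widths = x[k - 1:] - x[:n - k + 1]   # width of every candidate interval
    i = int(np.argmin(widths))           # index of the narrowest interval
    return 0.5 * (x[i + k - 1] - x[i])

# Example: posterior samples of a VA threshold (logMAR); for a Gaussian
# posterior the 68.2% HWCI is close to one standard deviation.
rng = np.random.default_rng(0)
samples = rng.normal(0.30, 0.03, 10_000)
print(round(hwci(samples), 3))
```

For an approximately Gaussian posterior this interval coincides with the usual ±1 standard-deviation interval, which is why the 68.2% mass is the conventional choice.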
Traditional VA assessments, including printed VA charts and computerized tests, sample the VABF with optotypes that are 0.1 logMAR apart, each tested on a row with a small number of optotypes.32–34 The method of limits and heuristic termination rules are used to determine a single VA score (but see Ref. 35). Although uncertainty is usually not estimated, studies with repeated tests found large uncertainties (approximately 0.07–0.34 logMAR)33,36–43 because of the coarse sampling of the VABF as well as the use of heuristic termination rules that depend on the probabilistic behavior of the human observer. In addition, the exact performance level (probability of correct optotype identification) associated with each heuristic termination rule depends on the range parameter of the VABF.24,25 As a result, we cannot meaningfully compare VA scores from patients with different VA range parameters, or interpret changes in the VA scores of the same patient whose range parameter changes between assessments.18,19 
The quantitative visual acuity (qVA) method24,25 is a Bayesian inference procedure developed to characterize the full VABF and quantify the uncertainty in each assessment of individual i in test j. In the qVA, the VABF (see Fig. 1) is characterized with parameter θij = (αij, βij), where αij is the VA threshold, corresponding to the \(d^{\prime}\) = 2 performance level, and βij is the range parameter of the function that covers the \(d^{\prime}\) = 1 to \(d^{\prime}\) = 4 performance levels.24,25 The qVA incorporates high-density optotype size sampling and starts with a broad joint prior distribution of θij, representing existing knowledge of the general pattern of the VABF (Fig. 2A). An active learning approach is used to optimize stimulus selection to reduce the expected uncertainty of the posterior distribution of θij in each trial (see Fig. 2). Bayes’ rule is used to compute the joint posterior distribution of θij (Figs. 2B, 2C), which allows us to quantify not only the VABF parameters but also their uncertainties from a single measurement. In computer simulations, we showed that the qVA could assess VABF parameters with virtually no bias, very small uncertainty in αij (HWCI = 0.028 logMAR), and small uncertainty in βij (HWCI = 0.092 logMAR), reflecting 52.5% and 49.5% uncertainty reductions of the estimated αij relative to the electronic-Early Treatment Diabetic Retinopathy Study (E-ETDRS) and Freiburg Visual Acuity and Contrast Test (FrACT) methods, respectively.25 The results were confirmed in a psychophysical study: estimated θij from the qVA exhibited very small uncertainty in αij (HWCI = 0.019 logMAR) and relatively small uncertainty in βij (HWCI = 0.062 logMAR). In addition, we found a significant correlation (r = 0.412, P < 0.001) between the estimated αij and βij across individuals and tests in the psychophysical experiment.25 
Figure 2.
 
Evolution of the joint distributions of VABF parameter θij of individual i in test j, from prior (A) to posterior probability distributions after 15 (B), and 45 (C) rows in a qVA test.
In its current implementation, each qVA test (Fig. 3A) starts with a broad prior. It does not take advantage of any knowledge across individuals and tests, such as the observed correlation between αij and βij,24,25 which could be useful in further constraining the estimated θijs and thereby reducing their uncertainties. In other words, the broad prior and independent treatment of each qVA test may overestimate the uncertainty of the estimated VABF. Although the qVA has greatly reduced the uncertainty of αij estimates, the uncertainty of βij estimates is still relatively large.25 
Figure 3.
 
(A) The qVA models the VABF of individual i in test j with two parameters: VA threshold αij and range βij, θij = (αij, βij). (B) A three-level hierarchical Bayesian model (HBM) of the VABF across multiple individuals and tests. At the population level, µ and Σ are the mean and covariance hyperparameters of the population. At the individual level ρi and φ are the mean and covariance hyperparameters of individual i. At the test level, θij is the VA threshold and range of individual i in test j. The same generative model in the qVA, the VABF (θij), determines the likelihood of response rijk of individual i in trial k of test j.
The observed correlation between the estimated αij and βij across individuals and tests in our previous study25 motivated us to develop a three-level HBM (Fig. 3B) to quantify uncertainties at the population, individual, and test levels in a single model. At the population level, p(η), the joint distribution of VABF hyperparameter η across all the individuals, is modeled as a mixture of two-dimensional Gaussian distributions with mean µ and covariance Σ, which have distributions p(µ) and p(Σ). At the individual level, p(τi|η), the joint distribution of VABF hyperparameter τi of individual i across all the tests performed by that individual, is modeled as a mixture of two-dimensional Gaussian distributions with mean ρi and covariance φ, which have distributions p(ρi|η) and p(φ); p(ρi|η) denotes that ρi is conditioned on η. At the test level, p(θij|τi), the joint distribution of the VABF parameters θij of individual i in test j, is conditioned on τi. The cross- and within-individual regularities are modeled as the covariances Σ and φ in the HBM.44–48 Bayes’ rule is then used to update the joint prior distribution of all the parameters and hyperparameters to their joint posterior distribution,49–53 which allows us to quantify the estimated hyperparameters and parameters, as well as their uncertainties, at the population, individual, and test levels.49,54,55 
The HBM has been used in many different disciplines to model data with hierarchical structures, including astronomy,56 ecology,57,58 genetics,59 and cognitive science.50,51,53,60–65 By decomposing the variability of an entire dataset into distributions at multiple levels of the hierarchy, it can better quantify uncertainty at each level.52,59,65 Here, we developed an HBM and evaluated its performance relative to that of the qVA using the qVA data from 14 eyes tested in 4 Bangerter foil conditions.25 Our goal was to improve the quantification of uncertainty at the test level. We hypothesized that the HBM would account for the data better than the qVA and reduce the uncertainties of the estimated VABF parameters in single VA tests relative to the qVA. 
Methods
Data
The dataset used in this study included a total of 56 tests: 14 eyes (left and right eyes of 7 subjects), each tested with the qVA with 3 levels of Bangerter foils66,67 and without foil.25 Each qVA test consisted of 45 rows (K = 45) of 3 optotypes (Fig. 4). In each qVA trial, 3 letters on a row, randomly sampled without replacement from the 10 Sloan letters (C, D, H, K, N, O, R, S, V, and Z) and with their size determined by the qVA, were presented to the subject. The subject was instructed to report the identity of the letters verbally. The joint posterior distributions of VABF parameters in each qVA test after k = 5, 10, 15, 20, 25, 30, 35, 40, and 45 rows were used to conduct statistical analysis. The details of the qVA can be found in prior publications.24,25 
Figure 4.
 
A sample stimulus sequence in a qVA test. Stimuli on rows 1, 2, 3, 43, 44, and 45 are shown, with three optotypes in each row.
Apparatus
All analysis was conducted on a Dell computer with Intel Xeon W-2145 @ 3.70 GHz CPU (8 cores and 16 threads) and 64 GB installed memory (RAM). The qVA was implemented in Matlab R2013a (MathWorks Corp., Natick, MA, USA) and the HBM was implemented with JAGS68 in R.69 
The HBM
Likelihood Function for a Single Test
The likelihood function of a model is defined as the probability of the observed data given a particular set of parameter values in the model. The VABF (see Fig. 1A) in qVA was used to compute the probability of individual i’s response rijk (the number of correct identifications: 0, 1, 2, or 3) of the 3 optotypes with optotype size osijk in row k of test j:  
\begin{equation}p\left( {{r_{ijk}}{\rm{|}}{\theta _{ij}},o{s_{ijk}}} \right) = f\left( {{\rm{g}}\left( {o{s_{ijk}},{\theta _{ij}}} \right),n} \right),\end{equation}
(1a)
where g(osijkij) is the VABF of correct identification of a single optotype with size osijk and model parameters θij = (αij, βij); f(g(osijk, θij), n) is the probability of observing the number of correct identifications given g(osijk, θij) and a specific chart design (e.g. n = 3 optotypes on a row). The details of functions f and g can be found in Supplementary Materials A and previous publications.24,43 The probability of obtaining the observed responses in all rows in test j from individual i is the product of the probability of responses in all rows (Equation 1a):  
\begin{equation}p\left( {{r_{ij,1:K}}{\rm{|}}{\theta _{ij}},o{s_{ij,1:K}}} \right) = \mathop \prod \limits_{k = 1}^K p\left( {{r_{ijk}}{\rm{|}}{\theta _{ij}},o{s_{ijk}}} \right).\end{equation}
(1b)
 
Three-Level Hierarchy
To account for the entire dataset with all the individuals and tests, we constructed an HBM with three levels (see Fig. 3). In this model, the population-level distribution of VABF hyperparameters sets the prior for the mean hyperparameters of each individual, which in turn sets the prior for the parameters of that individual in all of her/his tests. We describe the distributions at these three levels in more detail below. 
At the population level, p(η), the joint distribution of hyperparameter η, is modeled as a mixture of two-dimensional Gaussian distributions \({\cal N}\) with mean µ and covariance Σ, which have distributions p(µ) and p(Σ):  
\begin{equation}p\left( \eta \right) = {\cal N}\left( {\eta ,\mu ,\Sigma } \right)p\left( \mu \right)p\left( \Sigma \right).\end{equation}
(2)
 
At the individual level, p(τi|η), the joint distribution of hyperparameter τi of individual i, is modeled as a mixture of two-dimensional Gaussian distributions \({\cal N}\) with mean ρi and covariance φ, which have distributions p(ρi|η) and p(φ):  
\begin{equation}p\left( {{\tau _i}{\rm{|}}\eta } \right) = {\cal N}\left( {{\tau _i},{\rho _i},\phi } \right)p\left( {{\rho _i}|\eta } \right)p\left( \phi \right),\end{equation}
(3)
where p(ρi|η) denotes that ρi is conditioned on η. 
At the test level, p(θij|τi), the joint distribution of parameter θij, is conditioned on τi. 
The probability of obtaining the entire dataset is computed by probability multiplication:  
\begin{equation}\begin{array}{@{}l@{}} p\left( {{r_{1:I,1:J,1:K}}{\rm{|}}X,o{s_{1:I,1:J,1:K}}} \right) =\\ \mathop \prod \limits_{i = 1}^I \mathop \prod \nolimits_{j = 1}^J p\left( {{r_{ij,1:K}}{\rm{|}}{\theta _{ij}},o{s_{ij,1:K}}} \right)p\left( {{\theta _{ij}}{\rm{|}}{\tau _i}} \right)p\left( {{\tau _i}{\rm{|}}\eta } \right)p\left( \eta \right)\\ = \mathop \prod \limits_{i = 1}^I \mathop \prod \nolimits_{j = 1}^J p\left( {{r_{ij,1:K}}{\rm{|}}{\theta _{ij}},o{s_{ij,1:K}}} \right)p\left( {{\theta _{ij}}{\rm{|}}{\tau _i}} \right){\cal N}\left( {{\tau _i},{\rho _i},\phi } \right)\\p\left( {{\rho _i}|\eta } \right)p\left( \phi \right){\cal N}\left( {\eta ,\mu ,\Sigma } \right)p\left( \mu \right)p\left( \Sigma \right), \end{array}\end{equation}
(4)
where X = (θ1:I, 1:J, ρ1:I, φ, µ, Σ) are all the parameters and hyperparameters in the HBM, I is the total number of individuals, and J is the total number of tests on each individual. 
Bayesian Inference
We start with prior distributions of µ, Σ, and φ: 
\begin{equation}{p_0}\left( \mu \right) = {\cal U}\left( {{\mu _{0,min}},{\mu _{0,max}}} \right),\end{equation}
(5a)
 
\begin{equation}{p_0}\left( {{\Sigma ^{ - 1}}} \right) = {\cal W}\left( {\Sigma _{qVA}^{ - 1}/{\rm{\nu }},{\rm{\nu }}} \right),\end{equation}
(5b)
 
\begin{equation}{p_0}\left( {{\phi ^{ - 1}}} \right) = {\cal W}\left( {\phi _{qVA}^{ - 1}/{\rm{\nu }},{\rm{\nu }}} \right),\end{equation}
(5c)
where \({\cal U}\) is a two-dimensional uniform distribution with µ0,min = (−0.5, 0.1) logMAR and µ0,max = (1.3, 1.5) logMAR; the precision matrices φ−1 and Σ−1 are the inverses of the covariances φ and Σ; \({\cal W}( {\frac{{\rm{Y}}}{{\rm{\nu }}},{\rm{\nu }}} )\) denotes a Wishart distribution with expected mean precision matrix Y and degrees of freedom ν = 2; \(\phi _{qVA}^{ - 1}\) is the inverse of the average covariance matrix φqVA computed from the joint posterior distributions across all qVA tests in the dataset; and \(\Sigma _{qVA}^{ - 1}\) is the inverse of the covariance matrix ΣqVA computed from the estimated θij across all qVA tests. 
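The priors in Equations 5a through 5c can be sampled directly. The Python sketch below uses `scipy.stats.wishart`, whose mean equals df × scale, so setting scale = Σ−1/ν with df = ν reproduces the W(Y/ν, ν) parameterization above; the numeric Σ_qVA is a placeholder, whereas in the study it was computed from the qVA results across all 56 tests:

```python
import numpy as np
from scipy.stats import wishart

rng = np.random.default_rng(1)

# Eq. 5a: uniform prior on mu over a box in (threshold, range) space (logMAR)
mu_min = np.array([-0.5, 0.1])
mu_max = np.array([1.3, 1.5])
mu0 = rng.uniform(mu_min, mu_max)

# Eqs. 5b, 5c: Wishart priors on the precision matrices. Sigma_qVA below is
# a placeholder value; in the study it was estimated from the qVA results.
# scipy's wishart has mean df * scale, so scale = inv(Sigma)/nu gives the
# expected mean precision inv(Sigma), matching W(Y/nu, nu) with df = nu.
nu = 2
Sigma_qVA = np.array([[0.04, 0.01],
                      [0.01, 0.09]])
scale = np.linalg.inv(Sigma_qVA) / nu
precision_sample = wishart.rvs(df=nu, scale=scale, random_state=1)

print(mu0.round(3), precision_sample.shape)
```

Note that ν = 2 is the smallest admissible degrees of freedom for a 2 × 2 Wishart, making these priors weakly informative.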
Bayes’ rule is used to compute the joint posterior distribution of all the parameters and hyperparameters in the HBM:  
\begin{eqnarray}\!\!\!\begin{array}{l}p\left( {X{\rm{|}}{r_{1:I,1:J,1:K}},o{s_{1:I,1:J,1:K}}} \right) = \\\frac{{\mathop \prod \nolimits_{i = 1}^I \mathop \prod \nolimits_{j = 1}^J p\left( {{r_{ij,1:K}}{\rm{|}}{\theta _{ij}},o{s_{ij,1:K}}} \right)p\left( {{\theta _{ij}}{\rm{|}}{\tau _i}} \right){\cal N}\left( {{\tau _i},{\rho _i},\phi } \right)p\left( {{\rho _i}|\eta } \right){p_0}\left( \phi \right){\cal N}\left( {\eta ,\mu ,\Sigma } \right){p_0}\left( \mu \right){p_0}\left( \Sigma \right)}}{{\smallint \mathop \prod \nolimits_{i = 1}^I \mathop \prod \nolimits_{j = 1}^J p\left( {{r_{ij,1:K}}{\rm{|}}{\theta _{ij}},o{s_{ij,1:K}}} \right)p\left( {{\theta _{ij}}{\rm{|}}{\tau _i}} \right){\cal N}\left( {{\tau _i},{\rho _i},\phi } \right)p\left( {{\rho _i}|\eta } \right){p_0}\left( \phi \right){\cal N}\left( {\eta ,\mu ,\Sigma } \right){p_0}\left( \mu \right){p_0}\left( \Sigma \right)dX\;}},\end{array}\end{eqnarray}
(6)
where the denominator is the integral of the probability of obtaining the entire dataset (Equation 4) across all possible values of X and is a constant for a given dataset and HBM. 
Computing the Joint Posterior Distribution
In this study, an individual refers to a combination of subject, eye, and foil level, giving a total of I = 56 individuals. In addition, each individual was tested only once (J = 1). 
The complexity of the HBM increases exponentially with the number of hyperparameters and parameters; it would be computationally infeasible to evaluate every point in the 232-dimensional hyperparameter/parameter space defined in Equation 6. We therefore used the JAGS68 package in R69 to evaluate the joint posterior distribution (see Equation 6). JAGS uses an efficient sampling algorithm to generate representative samples of the joint posterior distribution of all the parameters and hyperparameters in the HBM via Markov chain Monte Carlo (MCMC). Each MCMC chain generated 15,000 samples via a random walk process (see Supplementary Materials B for details). Because the initial part of the random walk is largely influenced by the arbitrary starting position, steps in the burn-in and adaptation (further optimization of the sampling algorithm by JAGS) phases were discarded and not included in the analysis. The exact number of discarded steps differs for each model; in this study, 20,000 burn-in and 100,000 adaptation steps were used for each HBM based on results from pilot studies. Convergence of each parameter was evaluated with Gelman and Rubin's diagnostic70 based on the ratio of between- and within-MCMC variances along each dimension, that is, the variance of the samples across MCMC chains divided by the variance of the samples within each chain. The HBM was deemed “converged” when the ratios for all the parameters were smaller than 1.05. The data with k = 5, 10, 15, 20, 25, 30, 35, 40, and 45 rows were fit by the HBM separately. 
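Gelman and Rubin's diagnostic can be computed directly from the MCMC output. The Python sketch below implements the basic statistic for a single scalar parameter, with simulated chains standing in for JAGS output; the 1.05 threshold matches the convergence criterion used in this study:

```python
import numpy as np

def gelman_rubin(chains):
    """Potential scale reduction factor (R-hat) for one scalar parameter.
    `chains` has shape (m_chains, n_steps); values below ~1.05 were the
    convergence criterion used in this study."""
    chains = np.asarray(chains, dtype=float)
    m, n = chains.shape
    chain_means = chains.mean(axis=1)
    W = chains.var(axis=1, ddof=1).mean()   # mean within-chain variance
    B = n * chain_means.var(ddof=1)         # between-chain variance
    var_hat = (n - 1) / n * W + B / n       # pooled variance estimate
    return float(np.sqrt(var_hat / W))

# Simulated output: 4 chains targeting the same posterior...
rng = np.random.default_rng(2)
mixed = rng.normal(0.3, 0.05, size=(4, 5000))
# ...versus 4 chains stuck around different modes
stuck = mixed + np.arange(4)[:, None]

print(round(gelman_rubin(mixed), 3), round(gelman_rubin(stuck), 1))
```

Well-mixed chains give a ratio near 1, whereas chains that have not converged to a common distribution inflate the between-chain variance and push the ratio well above the threshold.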
Simulations
Each simulated dataset consisted of 56 qVA tests from 56 individuals, each tested once. The VABF parameters for the simulated observers were randomly sampled from the posterior distributions of θi1 obtained from the HBM fit to the real data. Each qVA test consisted of 45 three-optotype rows, and the row-by-row responses were determined by the VABF specified with the parameters of the simulated observer. The HBM was fit to the simulated dataset of 56 tests. The procedure was repeated 10 times. 
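The simulation procedure above can be sketched as follows. In this illustrative Python version, the observer parameters and optotype sizes are placeholders (in the study, parameters were drawn from the HBM posteriors fit to the real data and sizes were selected adaptively by the qVA), and a logistic psychometric function stands in for the qVA's VABF:

```python
import numpy as np

rng = np.random.default_rng(3)
GUESS = 0.1  # guessing rate for 10 Sloan letters

def g(os_logmar, alpha, beta):
    # Illustrative logistic psychometric function standing in for the VABF
    return GUESS + (1.0 - GUESS) / (1.0 + np.exp(-(os_logmar - alpha) / beta))

n_observers, n_rows, n_letters = 56, 45, 3

# Placeholder observer parameters; in the study these were drawn from the
# HBM posteriors fit to the real dataset.
alphas = rng.normal(0.4, 0.2, n_observers)
betas = rng.uniform(0.1, 0.5, n_observers)

dataset = []
for a, b in zip(alphas, betas):
    # Placeholder optotype sizes; the qVA chooses sizes adaptively.
    sizes = rng.uniform(a - 0.3, a + 0.3, n_rows)
    n_correct = rng.binomial(n_letters, g(sizes, a, b))
    dataset.append((sizes, n_correct))

print(len(dataset), dataset[0][1].shape)
```

Refitting the HBM to datasets generated this way, and repeating the whole procedure, checks whether the model recovers known parameters with calibrated uncertainty.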
Statistical Analysis
Goodness-of-Fit
Bayesian predictive information criterion (BPIC)71,72 was used to quantify and compare the goodness-of-fit of the HBM against that of the qVA. The BPIC quantifies the likelihood of the data based on the joint posterior distribution of the parameters of a model and penalizes model complexity. 
Uncertainty
We compared the uncertainties of estimated θi1 from the qVA and HBM. The uncertainty of each estimated parameter was quantified by the 68.2% HWCI of its marginal posterior distribution.26,27 Paired t-test (R function t.test69) was used to compare the 68.2% HWCIs between the two methods. 
Discrimination Accuracy
To demonstrate effects of uncertainty reduction, we computed the accuracy in detecting a 0.15 logMAR VA threshold (αi1) change, a 0.15 logMAR range (βi1) change, and a 0.15 logMAR change of both parameters. We chose 0.15 logMAR as the magnitude of αi1 change because a greater-than-15-letter VA improvement (0.3 logMAR) is considered by the US Food & Drug Administration (FDA) as an acceptable end point of a clinical trial, although a greater-than-10-letter VA improvement (0.2 logMAR) has been used when the benefits can outweigh the safety risks of the proposed method or product.73 We chose 0.15 logMAR as the magnitude of βij change, because a similar change has been reported in eyes with degraded vision.18,25 
Discrimination accuracy was quantified by the area under receiver operating characteristic curve (AUROC). Given a change-criterion, specificity is defined as the probability of correctly identifying no change, whereas sensitivity is defined as the probability of correctly identifying a change.74 Whereas sensitivity and specificity depend on the change criterion, the AUROC in signal detection theory provides a criterion-free estimate of the accuracy of a test to detect a change.74 It only depends on the discriminability \(d^{\prime}\) that weighs the evidence against uncertainty and quantifies the signal (change) to noise (HWCI) ratio of two probability distributions of the estimates in two conditions. 
For one-dimensional distributions, \(d^{\prime}\) is defined as74:  
\begin{equation}d^{\prime} = \frac{\Delta }{{{\rm{HWCI}}}},\end{equation}
(7)
where Δ  is the magnitude of change. For multidimensional difference distributions, \(d^{\prime}\) is defined as75:  
\begin{equation}d^{\prime} = \sqrt {\Delta *cov{{\left( \Delta \right)}^{ - 1}}*{\Delta ^T}} ,\end{equation}
(8a)
 
\begin{equation}cov\left( \Delta \right) = {\rm{\;}}\left[ {\begin{array}{@{}*{2}{c}@{}} {{\rm{HWCI}}_{threshold}^2}&{covariance}\\ {covariance}&{{\rm{HWCI}}_{range}^2} \end{array}} \right],\end{equation}
(8b)
 
\begin{eqnarray*}covariance = corr \times {\rm{HWC}}{{\rm{I}}_{threshold}} \times {\rm{HWC}}{{\rm{I}}_{range}},\end{eqnarray*}
where Δ = (0.15,  0.15), cov(Δ) is the covariance matrix of the VABF parameter difference distribution, cov(Δ)−1 is the inverse of cov(Δ), ΔT is the transpose of Δ, * represents matrix multiplication, and corr is the correlation coefficient at the test level in the HBM. 
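Equations 7, 8a, and 8b can be sketched in Python. The final step maps d′ to AUROC via the standard equal-variance Gaussian relation AUROC = Φ(d′/√2), which is our assumption for this sketch; the numeric HWCI and correlation values are placeholders rather than the study's estimates:

```python
import numpy as np
from scipy.stats import norm

def dprime_1d(delta, hwci):
    """Eq. 7: change magnitude over HWCI."""
    return delta / hwci

def dprime_2d(delta, hwci_threshold, hwci_range, corr):
    """Eqs. 8a-8b: d' for a joint change in threshold and range."""
    off = corr * hwci_threshold * hwci_range
    cov = np.array([[hwci_threshold**2, off],
                    [off, hwci_range**2]])
    d = np.asarray(delta, dtype=float)
    return float(np.sqrt(d @ np.linalg.inv(cov) @ d))

def auroc(dprime):
    """Equal-variance Gaussian mapping from d' to AUROC (an assumption of
    this sketch): AUROC = Phi(d' / sqrt(2))."""
    return float(norm.cdf(dprime / np.sqrt(2)))

# Placeholder values: 0.15 logMAR changes, HWCIs of 0.05 and 0.10 logMAR,
# and a test-level correlation of 0.4
d1 = dprime_1d(0.15, 0.05)
d2 = dprime_2d((0.15, 0.15), 0.05, 0.10, corr=0.4)
print(round(auroc(d1), 3), round(auroc(d2), 3))
```

Because d′ scales inversely with the HWCI, any reduction of uncertainty by the HBM translates directly into a higher AUROC for the same magnitude of change.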
Results
Goodness-of-Fit
The BPICs for the qVA and HBM were 4867 and 4832, respectively, indicating that the HBM fit the data better than the qVA. Figure 5 shows the posterior distributions of the estimated VABFs in one test from the qVA and HBM after 5, 15, and 45 rows. Although the HBM fit the data better and reduced the uncertainty of the estimated θ1:I, 1, it did not significantly change the expected values of the estimates: the average absolute differences between the qVA and HBM estimates of αi1 and βi1 across all individuals were 0.002 and 0.022 logMAR, respectively, indicating that the HBM and qVA estimates exhibited virtually the same central tendency. 
Figure 5.
 
Posterior distributions of the estimated VABFs in one test from the qVA (A, B, C) and HBM (D, E, F) after 5, 15, and 45 rows. Open circles represent data.
Posterior Distributions From the HBM
Figure 6 illustrates the joint posterior distributions of hyperparameters µ, Σ, and η after 45 rows. Whereas µ and Σ were obtained from the HBM fits to the data (see Equation 6), η was constructed from the distributions of µ and Σ (see Equation 2). The correlation coefficient between η1 and η2 was 0.662 (P < 0.001). Figure 7 illustrates the joint posterior distributions of hyperparameters ρi, φ, and τi for 8 of the 56 individuals after 45 rows. Whereas ρi and φ were obtained from the HBM fits to the data (see Equation 6), τi was constructed from the distributions of ρi and φ (see Equation 3). Figure 8A illustrates the joint posterior distributions of parameters θi1 for the 8 corresponding individuals in Figure 7 after 45 rows. Figure 8B shows the mean θi1 of all 56 tests. 
Figure 6.
 
Illustrations of the joint posterior distributions of hyperparameters µ (A), Σ (B), and η (C) after 45 rows at the population level in the HBM.
Figure 7.
 
Illustrations of the joint posterior distributions of hyperparameters ρi (A), φ (B), and τi (C) for 8 out of the 56 individuals after 45 rows in the HBM.
Figure 8.
 
(A) Illustration of the joint posterior distributions of parameters θi1 from 8 tests of the 8 individuals in Figure 7 after 45 rows in the HBM. (B) Mean θi1, \({\bar \theta _{i1}} = ( {{{\bar \alpha }_{i1}},{{\bar \beta }_{i1}}} )\), (circles) of all the 56 tests, with the 8 tests in A highlighted in red.
Uncertainty of θi1 Estimates
Figure 9 shows the estimated joint posterior distributions of θi1 after 5, 15, and 45 rows for one individual, obtained from both the qVA and HBM. Across all the individuals and tests, the average 68.2% HWCI of the estimated θi1 decreased with the number of rows in both methods (Fig. 10, Table 1). With 5 rows, the 68.2% HWCIs of the estimated αi1 (t(55) = 14.8, P < 0.001) and βi1 (t(55) = 13.7, P < 0.001) were significantly different between the methods. With 45 rows, the 68.2% HWCIs of the estimated αi1 (t(55) = 4.63, P < 0.001) and βi1 (t(55) = 6.74, P < 0.001) were significantly different between the methods. Relative to the qVA, the HBM reduced the HWCI of the estimated αi1 by 4.2% to 26.9%, and βi1 by 20.8% to 45.8%. 
Figure 9.
 
Joint posterior distributions of θi1 for one individual in one qVA test after 5 (A), 15 (B), and 45 (C) rows. Joint posterior distributions of the corresponding θi1 of the same test in the HBM after 5 (D), 15 (E), and 45 (F) rows.
Figure 10.
 
Average 68.2% HWCI (± standard error: dotted lines) of the estimated αi1 (A) and βi1 (B) as functions of number of tested rows in the qVA (open squares) and HBM (filled circles).
Table 1.
 
Average 68.2% HWCI (logMAR) of the Estimated αi1 and βi1 in the qVA and HBM
Uncertainty of the estimated parameters is quantified as the HWCI of the (marginal) posterior distribution of θi1. Because the trial-by-trial data and distribution of τi were different for each individual and test, the uncertainties were different across individuals and tests in the HBM (Fig. 11). 
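The HWCI used here is the half-width of the 68.2% credible interval of the marginal posterior. As an illustration only (not the authors' code), the sketch below computes it as half the width of the narrowest interval containing 68.2% of a set of posterior samples (a highest-density interval); the function name and the Gaussian toy posterior are assumptions:

```python
import numpy as np

def hwci(samples, mass=0.682):
    """Half-width of the narrowest interval containing `mass`
    of the posterior samples (highest-density interval)."""
    x = np.sort(np.asarray(samples))
    n = len(x)
    k = int(np.ceil(mass * n))          # number of samples inside the interval
    widths = x[k - 1:] - x[:n - k + 1]  # widths of all candidate intervals
    return widths.min() / 2.0           # half-width of the narrowest one

# Toy marginal posterior of a VA threshold (logMAR); values are illustrative.
rng = np.random.default_rng(0)
alpha_samples = rng.normal(0.10, 0.02, size=10_000)
print(round(hwci(alpha_samples), 3))    # ≈ 0.02 for this Gaussian posterior
```

For an approximately Gaussian marginal posterior, the 68.2% HWCI is close to one posterior standard deviation, which is why the HWCIs reported here can be read as standard-error-like uncertainties.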
Figure 11.
 
Histograms of 68.2% HWCI of the estimated αi1 (A) and βi1 (B) for tests in the qVA (blue) and HBM (red) after 45 rows.
Simulations
The HBM recovered the parameters of the simulated tests very well. The HBM solutions had near-zero bias for αi1 (−0.0024 logMAR) and βi1 (0.0057 logMAR) across all the simulations. The average root-mean-square errors (RMSEs) between the HBM estimates and the true values (0.019 and 0.047 logMAR for αi1 and βi1) were comparable to the HWCIs of the parameter estimates (0.019 and 0.050 logMAR for αi1 and βi1), indicating good convergence of the HBM fits. In addition, the HWCIs from the simulations were also comparable to those from the experimental data (0.018 and 0.048 logMAR for αi1 and βi1), suggesting that the simulations captured the uncertainties in the real experiment. 
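The bias and RMSE statistics above follow directly from the recovered and true parameter values. A minimal sketch, using hypothetical numbers rather than the study's simulation output:

```python
import numpy as np

def bias_and_rmse(estimates, truths):
    """Bias (mean signed error) and root-mean-square error between
    recovered parameter values and the simulated ground truth."""
    err = np.asarray(estimates) - np.asarray(truths)
    return err.mean(), np.sqrt((err ** 2).mean())

# Hypothetical recovered thresholds vs. simulated truths (logMAR).
true_alpha = np.array([0.00, 0.15, 0.30, 0.45])
est_alpha = np.array([0.01, 0.13, 0.32, 0.44])
bias, rmse = bias_and_rmse(est_alpha, true_alpha)
```

Comparing the RMSE against the average HWCI, as done above, checks whether the posterior widths honestly reflect the actual estimation error.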
Discrimination Accuracy
Discrimination accuracy increased with the number of rows (Table 2, Fig. 12). The HBM estimates exhibited higher accuracy than the qVA, especially when the number of rows was relatively small. To reach 95% accuracy in detecting a 0.15 logMAR αi1 change, 10 rows were needed with the qVA and 7 with the HBM; to detect a 0.15 logMAR change of both αi1 and βi1, 10 and 6 rows were needed, respectively.
Table 2.
 
Discrimination Accuracy (%) in Detecting a 0.15 logMAR αi1 Change, a 0.15 logMAR βi1 Change, and a 0.15 logMAR Change of Both Parameters
Figure 12.
 
Discrimination accuracy in detecting a 0.15 logMAR αi1 change (A), a 0.15 logMAR βi1 change (B), and a 0.15 logMAR change of both parameters (C) as functions of number of tested rows in the qVA (open squares) and HBM (filled circles).
Discussion
In this study, we developed an HBM to improve uncertainty quantification in VA assessment. By explicitly quantifying regularities at the population, individual, and test levels, the HBM utilized knowledge of the VABF across multiple assessments to constrain parameter estimates and decomposed the variability of the entire dataset into distributions at three levels. Applying the model to a dataset of qVA assessments of 14 eyes in 4 Bangerter foil conditions, we showed that the HBM provided better fits to the data than the qVA and reduced the HWCI of the αi1 estimates by 26.9%, 12.2%, and 4.2%, and of the βi1 estimates by 45.8%, 35.3%, and 20.8%, for tests with 5, 15, and 45 rows, respectively. Simulations showed that the HBM accurately recovered αi1 spanning about 0.8 logMAR (approximately −0.15 to 0.68 logMAR) and βi1 spanning about 0.4 logMAR (approximately 0.11 to 0.49 logMAR). The simulation results also validated the discrimination accuracy analysis. 
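The three-level structure (population, individual, test) can be made concrete by sampling from the generative model, with two-dimensional Gaussians at each level. In the sketch below, the numerical values of µ, Σ, and φ are placeholders, not fitted estimates:

```python
import numpy as np

rng = np.random.default_rng(1)

# Population level: hypothetical mean and covariance of the 2-D Gaussian
# over (VA threshold alpha, VA range beta), both in logMAR.
mu = np.array([0.2, 0.3])
Sigma = np.array([[0.040, 0.010],
                  [0.010, 0.020]])

# Individual level: a within-individual covariance phi; each individual's
# mean rho_i is drawn from the population distribution.
phi = np.array([[0.0020, 0.0005],
                [0.0005, 0.0010]])

n_individuals, n_tests = 14, 4
theta = np.empty((n_individuals, n_tests, 2))
for i in range(n_individuals):
    rho_i = rng.multivariate_normal(mu, Sigma)        # individual i's mean
    for j in range(n_tests):
        # Test level: theta_ij = (alpha_ij, beta_ij) of individual i, test j.
        theta[i, j] = rng.multivariate_normal(rho_i, phi)
```

Inference inverts this generative process: trial-by-trial responses constrain θij, which in turn constrain the individual- and population-level hyperparameters, and the shrinkage across levels is what reduces the HWCIs reported above.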
The HBM exhibited its largest advantage relative to the qVA when the number of tested rows in each qVA test was small (see Table 1). This is because the HBM reduces uncertainty by incorporating knowledge across all individuals and tests. As the number of rows increased, more knowledge in each qVA test became available and provided better constraints on the parameters. The results highlight the importance of the HBM when fewer rows are tested, which is often the case in clinical settings. In fact, the current HBM can be readily used to analyze data from other VA tests, such as the ETDRS (Lu et al. IOVS. 2021;62: ARVO Abstract 3546329). 
The HBM reduced the uncertainty of the estimated θi1 and improved the discrimination accuracy of the tests. The reduction of uncertainty, on average, resulted in 3 fewer rows needed to reach 95% accuracy in detecting changes of 0.15 logMAR in αi1 or both αi1 and βi1 than the qVA (see Table 2, Fig. 12). The increased accuracy in detecting βi1 changes in the HBM may enable a new range endpoint in functional vision,5 because it has been documented that poor VA thresholds were accompanied by wider VA ranges.25 In addition, using the joint distribution of θi1 can further increase the accuracy in detecting changes of VA behavior because the two-dimensional distributions may contain more information than the one-dimensional marginal distributions.76–78 
In this paper, we constructed a simple three-level HBM without considering any structure related to test conditions because our goal was to quantify the uncertainty of the estimated VABF in each test. The HBM can be extended to model regularities between different experimental conditions (e.g. before and after treatment) with covariance. A positive covariance between conditions can enhance the ability to detect differences between conditions, and therefore improve statistical power in clinical trials (unpublished communication: Zhao Y, Lesmes LA, Hou F, Lu Z-L. Hierarchical Bayesian modeling of contrast sensitivity functions in a within-subject design, 2021). The HBM can also be extended to model repeated tests of the same individual, with parameters θij of all the tests conditioned on τi. 
Without explicit consideration of the full experimental design (e.g. two eyes from each subject), the current HBM may under- or overestimate the uncertainty from the tests, depending on the nature of the correlation (positive or negative) and the experimental design. To check the impact of this simplification, we ran 50 iterations of the HBM on the data from one randomly selected eye of each of the 7 subjects in 4 conditions. The procedure eliminated the correlation between eyes. We obtained virtually the same estimated θi1 as the original HBM fit to the 14 eyes, with slightly higher HWCIs of the estimated βi1 (0.018 ± 0.00082 and 0.050 ± 0.00184 logMAR for αi1 and βi1 in the 7-eye fits, versus 0.018 and 0.048 logMAR in the 14-eye fit). The RMSEs between the HBM fits to the 7 and 14 eyes were 0.00078 and 0.00793 logMAR for αi1 and βi1, and 0.00036 and 0.00350 logMAR for the corresponding HWCIs. These results also suggest that the original experimental dataset, with 14 eyes tested in 4 conditions, was sufficient to constrain the HBM, because halving the amount of data generated virtually the same estimates. 
Uncertainty in assessing the probabilistic VA behavior is determined by three factors: prior knowledge of the VA behavior, data collected in the assessment, and data analysis method. We developed the HBM as a data analysis tool in this study. The posterior distributions of the hyperparameters η and τi at the population and individual levels from the HBM can also be used to construct informative priors (see Figs. 6C, 7C) within the hierarchical adaptive design optimization (HADO) framework79,80 for new individuals (see Fig. 6C) and repeated tests of the same individual (see Fig. 7C), respectively. In the HADO framework, the joint distribution in the HBM is updated after each test and used as an informative prior in the next test to improve test efficiency for new individuals or repeated tests of the same individual. The 6-hour computation time of the HBM on a desktop computer makes it realistic to incorporate it within the HADO framework with daily updates of the joint posterior distribution. 
Conclusion
We developed a three-level HBM to utilize knowledge of VA behavior across multiple individuals and tests by explicitly modeling cross- and within-individual covariances. The HBM reduced the uncertainties of the estimated VABF parameters, especially when the number of tested rows was relatively small. We applied the HBM to a qVA dataset of normal healthy subjects with simulated visual degradations. The benefit of the HBM in real clinical populations needs further research. 
Acknowledgments
Supported by the National Eye Institute (EY021553 and EY017491). L.A.L. has an intellectual property interest in methods for quantitative visual acuity testing (US10758120B2). In addition, L.A.L., M.D., and Z.L.L. have intellectual property interests in methods for measuring and applying contrast sensitivity functions (US 7938538, WO2013170091, and PCT/US2015/028657), and equity interest in Adaptive Sensory Technology, Inc. (San Diego, CA, USA). L.A.L. and M.D. hold employment in AST. 
Disclosure: Y. Zhao, None; L.A. Lesmes, (I, E, P); M. Dorr, (I, E, P); Z.-L. Lu, (I, P) 
References
Kniestedt C, Stamper RL. Visual acuity and its measurement. Ophthalmol Clin North Am. 2003; 16: 155–170. [CrossRef] [PubMed]
Clare G, Pitts JA, Edgington K, Allan BD. From beach lifeguard to astronaut: Occupational vision standards and the implications of refractive surgery. Br J Ophthalmol. 2010; 94: 400–405. [CrossRef] [PubMed]
Kiel AW, Butler T, Alwitry A. Visual acuity and legal visual requirement to drive a passenger vehicle. Eye. 2003; 17(5): 579–582. [CrossRef]
Levenson JH, Kozarsky A. Visual Acuity. In: Walker HK, Hall WD, & Hurst JW, eds. Clinical Methods: The History, Physical, and Laboratory Examinations (3rd ed.). Boston, US: Butterworths; 1990.
Rahi JS, Cumberland PM, Peckham CS. Does amblyopia affect educational, health, and social outcomes? Findings from 1958 British birth cohort. Br Med J. 2006; 332(7545): 820–824. [CrossRef]
National Research Council (US) Committee on Disability Determination for Individuals with Visual Impairments; Lennie P, Van Hemel SB, eds. Visual Impairments: Determining Eligibility for Social Security Benefits. Washington, DC: National Academies Press (US); 2002.
National Institute for Health and Care Excellence. Age-related macular degeneration: NICE guideline (NG82). January 2018. Available at: https://www.nice.org.uk/guidance/ng82.
Bailey I, Bullimore M, Raasch T, Taylor H. Clinical grading and the effects of scaling. Invest Ophthalmol Vis Sci. 1991; 32: 422–432. [PubMed]
Carkeet A. Modeling logMAR visual acuity scores: Effects of termination rules and alternative forced-choice options. Optom Vis Sci. 2001; 78: 529–538. [CrossRef] [PubMed]
Lovie-Kitchin JE. Validity and reliability of visual-acuity measurements. Ophthalmic Physiol Opt. 1988; 8: 363–370. [CrossRef] [PubMed]
Raasch TW, Bailey IL, Bullimore MA. Repeatability of visual acuity measurement. Optom Vis Sci. 1998; 75: 342–348. [CrossRef] [PubMed]
Barlow H. Retinal noise and absolute threshold. J Opt Soc Am. 1956; 46: 634–639. [CrossRef]
Faisal AA, Selen LPJ, Wolpert DM. Noise in the nervous system. Nat Rev Neurosci. 2008; 9: 292–303. [CrossRef] [PubMed]
Lu Z-L, Dosher BA. Characterizing observers using external noise and observer models: Assessing internal representations with external noise. Psychol. Rev. 2008; 115: 44–82. [CrossRef] [PubMed]
Neri P. How inherently noisy is human sensory processing? Psychon Bull Rev. 2010; 17: 802–808. [CrossRef] [PubMed]
Alexander R, Xie W, Derlacki DJ. Visual acuity and contrast sensitivity for individual Sloan letters. Vision Res. 1997; 37: 813–819. [CrossRef] [PubMed]
Carkeet A, Lee L, Kerr JR, Keung MM. The slope of the psychometric function for Bailey-Lovie letter charts: Defocus effects and implications for modeling letter-by-letter scores. Optom Vis Sci. 2001; 78(2): 113–121. [CrossRef] [PubMed]
Horner DG, Paul AD, Katz B, Bedell HE. Variations in the slope of the psychometric acuity function with acuity threshold and scale. J Optom. 1985; 62: 895–900. [CrossRef]
Strasburger H. Converting between measures of slope of the psychometric function. Percept Psychophys. 2001; 63: 1348–1355. [CrossRef] [PubMed]
Bach M. The Freiburg visual acuity test-variability unchanged by post-hoc re-analysis. Graefes Arch Clin Exp Ophthalmol. 2007; 245: 965–971. [CrossRef] [PubMed]
Carkeet A, Bailey IL. Slope of psychometric functions and termination rule analysis for low contrast acuity charts. Ophthalmic Physiol Opt. 2017; 37(2): 118–127. [CrossRef] [PubMed]
Hazel CA, Elliott DB. The dependency of LogMAR visual acuity measurements on chart design and scoring rule. Optom Vis Sci. 2002; 79(12): 788–792. [CrossRef] [PubMed]
Lesmes LA, Dorr M. Active learning for visual acuity testing. In: Proceedings of the 2nd International Conference on Applications of Intelligent Systems. APPIS ’19. New York, NY: Association for Computing Machinery; 2019: 1–6.
Zhao Y, Lesmes LA, Dorr M, Bex PJ, Lu Z-L. Psychophysical validation of a novel active learning approach for measuring the visual acuity behavioral function. Trans Vis Sci Tech. 2021; 10: 1. [CrossRef]
Lu Z-L, Dosher BA. Visual Psychophysics: From Laboratory to Theory. Cambridge, MA: MIT Press; 2013.
Clayton D, Hills M. Statistical models in epidemiology. Oxford, UK: Oxford University Press; 1993.
Edwards W, Lindman H, Savage LJ. Bayesian statistical inference for psychological research. Psychol Rev. 1963; 70(3): 193–242. [CrossRef]
Wichmann FA, Hill NJ. The psychometric function: I. Fitting, sampling, and goodness of fit. Percept Psychophys. 2001; 63: 1293–1313. [CrossRef] [PubMed]
Wichmann FA, Hill NJ. The psychometric function: II. Bootstrap-based confidence intervals and sampling. Percept Psychophys. 2001; 63: 1314–1329. [CrossRef] [PubMed]
Hou F, Lesmes L, Bex P, Dorr M, Lu Z-L. Using 10AFC to further improve the efficiency of the quick CSF method. J Vis. 2015; 15: 2. [CrossRef] [PubMed]
Bailey IL, Lovie JE. New design principles for visual acuity letter charts. Am J Optom Physiol Opt. 1976; 53: 740–745. [CrossRef] [PubMed]
Beck RW, Moke PS, Turpin AH, et al. A computerized method of visual acuity testing: Adaptation of the early treatment of diabetic retinopathy study testing protocol. Am J Ophthalmol. 2003; 135: 194–205. [CrossRef] [PubMed]
Ferris FL, Kassoff A, Bresnick GH, Bailey I. New visual acuity charts for clinical research. Am J Ophthalmol. 1982; 94: 91–96. [CrossRef] [PubMed]
Bach M. The Freiburg visual acuity test–automatic measurement of visual acuity. Optom Vis Sci. 1996; 73: 49–53. [CrossRef] [PubMed]
Arditi A, Cagenello R. On the statistical reliability of letter-chart visual acuity measurements. Invest Ophthalmol Vis Sci. 1993; 34: 120–129. [PubMed]
Bokinni Y, Shah N, Maguire O, Laidlaw DAH. Performance of a computerised visual acuity measurement device in subjects with age-related macular degeneration: comparison with gold standard ETDRS chart measurements. Eye. 2015; 29: 1085–109. [CrossRef] [PubMed]
Elliott DB, Sheridan M. The use of accurate visual-acuity measurements in clinical anti-cataract formulation trials. Ophthalmic Physiol Opt. 1988; 8: 397–401. [CrossRef] [PubMed]
Laidlaw DAH, Abbott A, Rosser DA. Development of a clinically feasible logMAR alternative to the Snellen chart: Performance of the "compact reduced logMAR" visual acuity chart in amblyopic children. Br J Ophthalmol. 2003; 87(10): 1232–1234. [CrossRef] [PubMed]
Laidlaw DAH, Tailor V, Shah N, Atamian S, Harcourt C. Validation of a computerised logMAR visual acuity measurement system (COMPlog): comparison with ETDRS and the electronic ETDRS testing algorithm in adults and amblyopic children. Br J Ophthalmol. 2008; 92: 241–244. [CrossRef] [PubMed]
Patel PJ, Chen FK, Rubin GS, Tufail A. Intersession repeatability of visual acuity scores in age-related macular degeneration. Invest Ophthalmol Vis Sci. 2008; 49: 4347–4352. [CrossRef] [PubMed]
Rosser DA, Laidlaw DAH, Murdoch IE. The development of a “reduced logMAR” visual acuity chart for use in routine clinical practice. Br J Ophthalmol. 2001; 85: 432–436. [CrossRef] [PubMed]
Vanden Bosch ME, Wall M. Visual acuity scored by the letter-by-letter or probit methods has lower retest variability than the line assignment method. Eye. 1997; 11: 411–417. [CrossRef] [PubMed]
Daniels MJ, Kass RE. Nonconjugate Bayesian estimation of covariance matrices and its use in hierarchical models. J Am Stat Assoc. 1999; 94(448): 1254. [CrossRef]
Klotzke K, Fox J-P. Bayesian covariance structure modeling of responses and process data. Front Psychol. 2019; 10: 1675. [CrossRef] [PubMed]
Thall PF, Wathen JK, Bekele BN, Champlin RE, Baker LH, Benjamin RS. Hierarchical Bayesian approaches to phase II trials in diseases with multiple subtypes. Stat Med. 2003; 22(5): 763–780. [CrossRef] [PubMed]
Wang C, Lin X, Nelson KP. Bayesian hierarchical latent class models for estimating diagnostic accuracy. Stat Methods Med Res. 2020; 29(4): 1112–1128. [CrossRef] [PubMed]
Yang J, Zhu H, Choi T, Cox DD. Smoothing and mean-covariance estimation of functional data with a Bayesian hierarchical model. Bayesian Anal. 2016; 11(3): 649–670.
Kruschke JK. Doing Bayesian data analysis: a tutorial with R, JAGS, and Stan. San Diego, CA: Academic Press; 2014.
Lee MD. A hierarchical Bayesian model of human decision-making on an optimal stopping problem. Cogn Sci. 2006; 30(3): 1–26. [CrossRef] [PubMed]
Lee MD. How cognitive modeling can benefit from hierarchical Bayesian models. J Math Psychol. 2011; 55(1): 1–7. [CrossRef]
Rouder JN, Lu J. An introduction to Bayesian hierarchical models with an application in the theory of signal detection. Psychon Bull Rev. 2005; 12(4): 573–604 [CrossRef] [PubMed]
Wilson JD, Cranmer S, Lu Z-L. A hierarchical latent space network model for population studies of functional connectivity. Comput Brain Behav. 2020; 3: 384–399. [CrossRef]
Lee MD. Three case studies in the Bayesian analysis of cognitive models. Psychon Bull Rev. 2008; 15(1): 1–15. [CrossRef] [PubMed]
Thrane E, Talbot C. An introduction to Bayesian inference in gravitational-wave astronomy: Parameter estimation, model selection, and hierarchical models. Publ Astron Soc Aust. 2019; 36: e010. [CrossRef]
Reum JCP, Hovel RA, Greene C.M. Estimating continuous body size-based shifts in delta N-15-Delta C-13 space using multivariate hierarchical models. Mar Biol. 2015; 162(2): 469–478. [CrossRef]
Wikle CK. Hierarchical Bayesian models for predicting the spread of ecological processes. Ecology. 2003; 84(6): 1382–1394. [CrossRef]
Storz JF, Beaumont MA. Testing for genetic evidence of population expansion and contraction: An empirical analysis of microsatellite DNA variation using a hierarchical Bayesian model. Evolution. 2002; 56(1): 154–166. [CrossRef] [PubMed]
Ahn W-Y, Krawitz A, Kim W, Busmeyer JR, Brown JW. A model-based FMRI analysis with hierarchical Bayesian parameter estimation. J Neurosci Psychol Econ. 2011; 4(2): 95–110. [CrossRef] [PubMed]
Lee TS, Mumford D. Hierarchical Bayesian inference in the visual cortex. J Opt Soc Am A Opt Image Sci Vis. 2003; 20(7): 1434–1448. [CrossRef] [PubMed]
Merkle EC, Smithson M, Verkuilen J. Hierarchical models of simple mechanisms underlying confidence in decision making. J Math Psychol. 2011; 55(1): 57–67. [CrossRef]
Molloy MF, Bahg G, Li X, Steyvers M, Lu Z-L, Turner BM. Hierarchical Bayesian analyses for modeling BOLD time series data. Comput Brain Behav. 2018; 1(2): 184–213. [CrossRef]
Molloy MF, Bahg G, Lu Z-L, Turner BM. Individual differences in the neural dynamics of response inhibition. J Cogn Neurosci. 2019; 31(12): 1976–1996. [CrossRef] [PubMed]
Rouder JN, Sun DC, Speckman PL, Lu J, Zhou D. A hierarchical Bayesian statistical framework for response time distributions. Psychometrika. 2003; 68(4): 589–606. [CrossRef]
Pérez GM, Archer SM, Artal P. Optical characterization of Bangerter foils. Invest Ophthalmol Vis Sci. 2010; 51: 609–613. [CrossRef] [PubMed]
Odell NV, Leske DA, Hatt SR, Adams WE, Holmes JM. The effect of Bangerter filters on optotype acuity, Vernier acuity, and contrast sensitivity. J AAPOS. 2008; 12(6): 555–559. [CrossRef] [PubMed]
Plummer M. JAGS: A program for analysis of Bayesian graphical models using Gibbs sampling. In: Proceedings of the 3rd International Workshop on Distributed Statistical Computing; 2003. Available at: https://www.r-project.org/nosvn/conferences/DSC-2003/Drafts/Plummer.pdf.
R Core Team. R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing; 2003. Available at: https://www.R-project.org/.
Gelman A, Rubin DB. Inference from iterative simulation using multiple sequences. Stat Sci. 1992; 7: 457–511.
Ando T. Bayesian predictive information criterion for the evaluation of hierarchical Bayesian and empirical Bayes models. Biometrika. 2007; 94: 443–458. [CrossRef]
Ando T. Predictive Bayesian model selection. Am J Math Manag Sci. 2011; 31: 13–38.
Csaky K, Ferris F, Chew EY, Nair P, Cheetham JK, Duncan JL. Report from the NEI/FDA endpoints workshop on age-related macular degeneration and inherited retinal diseases. Invest Ophthalmol Vis Sci. 2017; 58(9): 3456–3463. [CrossRef] [PubMed]
Green DM, Swets JA. Signal Detection Theory and Psychophysics. New York, NY: John Wiley & Sons; 1966.
Ashby FG, Townsend JT. Varieties of perceptual independence. Psychol Rev. 1986; 93(2): 154–179. [CrossRef] [PubMed]
Borm GF, Fransen J, Lemmens WAJG. A simple sample size formula for analysis of covariance in randomized clinical trials. J Clin Epidemiol. 2007; 60(12): 1234–1238. [CrossRef] [PubMed]
Egbewale BE, Lewis M, Sim J. Bias, precision and statistical power of analysis of covariance in the analysis of randomized trials with baseline imbalance: A simulation study. BMC Med Res Methodol. 2014; 14: 49. [CrossRef] [PubMed]
Wilcox R. Modern Statistics for the Social and Behavioral Sciences: A Practical Introduction (pp. 101–102). Boca Raton, FL: CRC Press; 2012.
Gu H, Kim W, Hou F, et al. A hierarchical Bayesian approach to adaptive vision testing: A case study with the contrast sensitivity function. J Vis. 2016; 16(6): 15. [CrossRef] [PubMed]
Kim W, Pitt MA, Lu Z-L, Steyvers M, Myung JI. A hierarchical adaptive approach to optimal experimental design. Neural Comput. 2014; 26(11): 2465–2492. [CrossRef] [PubMed]
Figure 1.
 
VA behavioral function (VABF) and uncertainty in VA assessments. (A) A VABF with two parameters: VA threshold and range. (B) Estimated VABFs from repeated ETDRS tests, which sample the theoretical VABF with a number of optotype sizes, each tested five times (circles). (C) A color density plot of the point VABF parameter estimates across repeated ETDRS tests. (D) Data (circles) from one ETDRS assessment and the best fitting VABF (curve). (E) Probability distribution of the estimated VABFs from D. (F) The joint probability distribution of the estimated VABF parameters from D.
Figure 2.
 
Evolution of the joint distributions of VABF parameter θij of individual i in test j, from prior (A) to posterior probability distributions after 15 (B), and 45 (C) rows in a qVA test.
Figure 3.
 
(A) The qVA models the VABF of individual i in test j with two parameters: VA threshold αij and range βij, θij = (αij, βij). (B) A three-level hierarchical Bayesian model (HBM) of the VABF across multiple individuals and tests. At the population level, µ and Σ are the mean and covariance hyperparameters of the population. At the individual level, ρi and φ are the mean and covariance hyperparameters of individual i. At the test level, θij is the VA threshold and range of individual i in test j. The same generative model in the qVA, the VABF (θij), determines the likelihood of response rijk of individual i in trial k of test j.
Figure 4.
 
A sample stimulus sequence in a qVA test. Stimuli on rows 1, 2, 3, 43, 44, and 45 are shown, with three optotypes in each row.
Figure 5.
 
Posterior distributions of the estimated VABFs in one test from the qVA (A, B, C) and HBM (D, E, F) after 5, 15, and 45 rows. Open circles represent data.
Figure 6.
 
Illustrations of the joint posterior distributions of hyperparameters µ (A), Σ (B), and η (C) after 45 rows at the population level in the HBM.
Figure 7.
 
Illustrations of the joint posterior distributions of hyperparameters ρi (A), φ (B), and τi (C) for 8 out of the 56 individuals after 45 rows in the HBM.