The contrast sensitivity function (CSF) is an important measure of spatial vision,
1–3 offering valuable insights into changes in vision associated with the progression of eye diseases and their treatment.
4–9 Various diseases and different stages within a single disease can impact contrast sensitivity (CS) at different spatial frequencies (SF). For instance, glaucoma is particularly known for impairing CS at low SFs,
10 whereas conditions like age-related macular degeneration (AMD),
11–13 cataracts
14,15 (but see reference
16), diabetic retinopathy,
17,18 and idiopathic epiretinal membrane
19 tend to affect CS predominantly at low-to-intermediate SFs. Amblyopia,
20,21 on the other hand, causes CS impairments at intermediate-to-high SFs. Studies have also demonstrated that AMD,
22 inherited retinal degeneration,
23 macula-off retinal detachment,
24 multiple sclerosis,
25,26 and myopia
27–31 can cause CS impairments across low, intermediate, and high SFs. Consequently, to accurately characterize and monitor disease progression in individual patients in clinical settings and to quantify the efficacy of specific treatments in clinical trials, it is imperative to assess CS at multiple SFs. Evaluating CS changes across multiple SFs is also crucial if CS is to be considered as a safety and efficacy endpoint in clinical trials.
32,33
Adaptive procedures have traditionally been used to measure CS independently at each of a predetermined set of SFs in laboratory settings. In conventional designs, each additional SF condition requires a minimum number of experimental trials, often 50 to 100.
34 Sampling the CSF at five to 10 SFs typically demands 500 to 1000 trials, translating to 30 to 60 minutes of data collection. This volume of data, although acceptable for experiments focusing on the CSF in a single condition, becomes impractical for measuring multiple CSFs (e.g., for different eyes), even in controlled laboratory environments.
35 Conversely, the various CSF charts used in clinical settings (e.g., Arden cards, Vistech, FACT charts; see reference
36 for a review) exhibit limited flexibility and reliability.
37–41
The significance of the CSF in both basic and clinical vision, together with the absence of a precise and efficient assessment instrument, prompted the development of the quantitative CSF (qCSF) test.
35 The method models the CSF as a four-parameter truncated log parabola function
42,43 and uses a Bayesian active learning algorithm to optimally estimate the posterior distribution of the four parameters by selecting the stimulus for each subsequent trial that maximizes the expected information gain.
35,44 Using a 10-letter identification task, the CSF can be assessed in about 20 trials (approximately two minutes), yielding an average standard deviation (SD) of 0.10 log10 units for the CSs estimated from the parametric CSF model.
45,46 A recent analysis showed that qCSF, with the optimal information gain strategy, generated significantly more expected information gain than the Pelli-Robson and the CSV-1000 tests.
47
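The stimulus-selection principle described above can be sketched in a deliberately simplified form. The example below is not the qCSF algorithm itself: it assumes a single unknown log threshold instead of four CSF parameters, a logistic psychometric function with a hypothetical slope, and a 10% guess rate for a 10-alternative task. It computes the expected information gain of each candidate contrast as the mutual information between the response and the threshold, and then picks the maximizing contrast.

```python
import math

def p_correct(log_contrast, log_threshold, guess=0.1, slope=4.0):
    """Assumed logistic psychometric function (slope is a hypothetical value)."""
    return guess + (1.0 - guess) / (1.0 + math.exp(-slope * (log_contrast - log_threshold)))

def entropy(probs):
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0.0)

def expected_information_gain(prior, thresholds, log_contrast):
    """Mutual information between the binary response and the threshold."""
    p_c = [p_correct(log_contrast, t) for t in thresholds]
    p_marginal = sum(w * p for w, p in zip(prior, p_c))
    h_response = entropy([p_marginal, 1.0 - p_marginal])
    h_given_threshold = sum(w * entropy([p, 1.0 - p]) for w, p in zip(prior, p_c))
    return h_response - h_given_threshold

# Hypothetical grids: candidate thresholds and candidate stimuli (log10 contrast).
thresholds = [-2.0 + 0.1 * i for i in range(21)]
prior = [1.0 / len(thresholds)] * len(thresholds)
candidates = [-2.5 + 0.1 * i for i in range(31)]

best = max(candidates, key=lambda c: expected_information_gain(prior, thresholds, c))
print(f"most informative log10 contrast: {best:.1f}")
```

In a full procedure the prior would be replaced by the posterior after each response, and the search would run over both SF and contrast; those steps are omitted here.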
Since its debut, qCSF has been used to evaluate spatial vision in both normal
48–53 and clinical populations. Applications include aging,
51,54 amblyopia,
20,21,55 cataracts,
15,56 central serous chorioretinopathy, diabetic retinopathy,
17,18,57,58 dysthyroid optic neuropathy,
59 Fuchs uveitis syndrome,
56 glaucoma,
10,60–62 idiopathic epiretinal membrane,
19 inherited retinal degeneration,
23 keratoconus,
63,64 multiple sclerosis,
25,26 myopia,
27–31 and various maculopathies
65 such as age-related macular degeneration,
11–13,22,66–68 geographic atrophy,
12,69 macula-off retinal detachment,
24 and retinal vein occlusion.
70,71 However, most studies with qCSF
34 have used the area under the log CSF (AULCSF),
10–12,14,15,17–19,22,24,26–30,46,50,54–57,59,62,63,67,69,71–80 whereas some have used CS at individual SFs
10–12,14,15,17–19,22,24,26–30,50,51,54,56,59,63,71,73–75,78–80 for statistical inference because of the undesirable mathematical properties associated with the CSs computed from the parametric CSF model.
43 Relying on AULCSF or CS at single SFs overlooks the rich information available across SFs in the CSF, thereby limiting the full potential of the CSF test.
In the four-parameter truncated log parabola CSF model, CSs across all SFs are determined by the mathematical function. Consequently, the CS values generated at different SFs are not independent, given their deterministic relationship dictated by the model. This means that the CS at any SF can be derived if CS values at four specific SFs are known. When computing samples of CSF from the posterior distributions of the model parameters, CSs between pairs of SFs exhibit correlation due to their deterministic connection, reflecting the mathematical properties of the CSF model rather than the true correlation between independently estimated CS values at each SF. To perform meaningful statistical inference on CS at individual SFs or combinations of multiple SFs, it is essential to eliminate the a priori deterministic relationship introduced by the parametric model while considering the true correlations of CS values across SFs.
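To make the deterministic relationship concrete, the following sketch evaluates a truncated log parabola at several SFs. The functional form is an assumed, commonly used parameterization (peak gain, peak SF, bandwidth, and low-frequency truncation), with the bandwidth expressed directly as the full width at half maximum in log10 units for simplicity. The parameter values and the SF list are hypothetical and chosen only for illustration.

```python
import math

def log_csf(sf, peak_gain, peak_sf, fwhm_log10, truncation):
    """log10 contrast sensitivity at spatial frequency sf (cpd).

    Assumed form: a log parabola that falls by log10(2) at half the
    bandwidth from the peak, flattened on the low-SF side at
    (log10 peak gain - truncation).
    """
    log_peak = math.log10(peak_gain)
    offset = (math.log10(sf) - math.log10(peak_sf)) / (fwhm_log10 / 2.0)
    parabola = log_peak - math.log10(2.0) * offset ** 2
    floor = log_peak - truncation
    if sf < peak_sf and parabola < floor:
        return floor
    return parabola

# Hypothetical parameter values and spatial frequencies.
params = dict(peak_gain=100.0, peak_sf=2.0, fwhm_log10=0.78, truncation=0.5)
sfs = [1.0, 1.5, 3.0, 6.0, 12.0, 18.0]

curve = [log_csf(f, **params) for f in sfs]
# Changing a single parameter moves the CS at every SF at once.
shifted = [log_csf(f, **{**params, "peak_sf": 4.0}) for f in sfs]

for f, a, b in zip(sfs, curve, shifted):
    print(f"{f:5.1f} cpd  log CS = {a:.3f}  (peak SF 4 cpd: {b:.3f})")
```

Because every CS value is a function of the same four parameters, posterior samples of the parameters necessarily produce correlated CS values across SFs, which is the model-induced dependence discussed above.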
On the other hand, estimating CS from qCSF data at individual SFs without a parametric CSF model is highly challenging because of the limited number of trials at each SF in a qCSF assessment (Fig. 1).
Figure 1a illustrates the distribution of test stimuli in a typical 25-trial qCSF assessment,
46 binned at the six SFs designated by the Food and Drug Administration (FDA) (Table). Note that each qCSF trial consists of three optotypes with the same SF characteristics but different contrasts, and that SFs near or above 18 cpd were labeled as 18 cpd in this study. Estimating CS at each SF independently from this amount of data is exceptionally challenging and can lead to substantial uncertainty.
Figure 1b displays the estimated CSF from the qCSF data in
Figure 1a, obtained by fitting a psychometric function to the data at each SF using the Bayesian inference procedure (BIP) described later in this article. With an average SD of 0.23 ± 0.02 log10 units across SFs, which corresponds to about 70% of the estimated CS value in linear units, the estimated CSF is highly imprecise.
Figure 1c shows the distribution of the number of qCSF trials at each of the six SFs based on the data from 112 subjects tested in three luminance conditions with 25 qCSF trials.
46 On average, there are 2.5 ± 0.1, 4.3 ± 0.1, 3.0 ± 0.1, 5.0 ± 0.11, 4.2 ± 0.1, and 6.0 ± 0.1 trials at the six SFs.
Figure 1d illustrates the distributions of the SDs of the estimated CS at each of the six SFs based on the data of the 112 subjects using the BIP, after excluding those with CS < 0.0 log10 units. With an average SD of 0.252 ± 0.003 log10 units across all SFs and subjects, which corresponds to about 79% of the estimated CS values in linear units, the estimated CSs are highly imprecise.
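The percentages quoted alongside the log10 SDs can be checked with a short calculation. The sketch below assumes one plausible reading of those figures: an SD of s log10 units corresponds to a multiplicative factor of 10^s in linear units, so the relative spread is (10^s - 1). The function name is introduced here for illustration only.

```python
def log10_sd_to_percent(sd_log10):
    """Relative spread in linear units implied by an SD in log10 units.

    Assumption: one SD corresponds to a multiplicative factor of 10**sd.
    """
    return (10.0 ** sd_log10 - 1.0) * 100.0

for sd in (0.10, 0.23, 0.252):
    print(f"SD = {sd:.3f} log10 units -> about {log10_sd_to_percent(sd):.0f}% in linear units")
```

Under this reading, 0.23 and 0.252 log10 units map to roughly 70% and 79%, matching the values in the text, whereas the 0.10 log10 units reported for the parametric qCSF estimates corresponds to roughly 26%.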
In this investigation, our aim was to enable advanced statistical inference on changes of CS at individual SFs and across multiple SFs by analyzing the joint posterior distribution of CS across subjects, SFs and experimental conditions. We introduced a nonparametric hierarchical Bayesian model (HBM) structured across population, individual and test levels to compute the joint posterior distribution of CS at the six FDA-designated SFs across all three levels. For comparison, we also implemented the traditional BIP, which computes the posterior distribution of CS at each SF independently.
Within the Bayesian inference framework, given a fixed likelihood function, two critical elements influence the accuracy and precision of the estimated posterior distribution: the prior and the amount of data. The prior represents the probability distribution of the parameters to be estimated before new data are collected. In the ideal scenario, the prior serves as a mathematical description of our knowledge about the parameters before data collection, often expressed as a uniform distribution when there is little or no prior knowledge (Fig. 2a, column 1) or as a concentrated informative distribution when knowledge is abundant (Fig. 2b, column 1). A misinformed prior (Fig. 2c, column 1) may introduce bias. The more informative the prior, the higher the accuracy and precision of the estimated posterior distribution with the same amount of data (Fig. 2, columns 2–5). To achieve target levels of precision and accuracy, the most informative prior (Fig. 2b) requires the least amount of data, whereas the biased prior demands the most (Fig. 2c). As the amount of data increases, the impact of the prior diminishes (Fig. 2, column 5). This article primarily focuses on constructing informative priors using the HBM because of the limited data available from qCSF assessments. By explicitly modeling the covariance of CSs at the population and individual levels, as well as conditional dependencies across the three levels, the HBM generates an informative prior for CS at each SF in each test by incorporating information across all SFs and tests in a dataset.
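The interplay between prior and data described above can be illustrated with a toy conjugate example. The sketch assumes a single Gaussian parameter with known observation noise, uses a very wide Gaussian as a stand-in for the uniform prior, and fixes the sample mean at the true value so that the output is deterministic. All numerical values are hypothetical and are not taken from Fig. 2.

```python
import math

def posterior(prior_mean, prior_sd, sample_mean, noise_sd, n):
    """Posterior mean and SD for a Gaussian mean with known noise SD."""
    prior_precision = 1.0 / prior_sd ** 2
    data_precision = n / noise_sd ** 2
    post_var = 1.0 / (prior_precision + data_precision)
    post_mean = post_var * (prior_precision * prior_mean + data_precision * sample_mean)
    return post_mean, math.sqrt(post_var)

TRUE_VALUE = 1.5   # hypothetical true log10 CS
NOISE_SD = 0.5     # hypothetical per-trial noise
TRIAL_COUNTS = (0, 5, 50, 500)

priors = {
    "vague": (1.0, 10.0),        # wide Gaussian standing in for a uniform prior
    "informative": (1.5, 0.1),   # concentrated near the true value
    "misinformed": (0.8, 0.1),   # concentrated at the wrong value
}

results = {
    name: [posterior(m, s, TRUE_VALUE, NOISE_SD, n) for n in TRIAL_COUNTS]
    for name, (m, s) in priors.items()
}

for name, rows in results.items():
    summary = ", ".join(f"n={n}: {mu:.2f} ± {sd:.2f}" for n, (mu, sd) in zip(TRIAL_COUNTS, rows))
    print(f"{name:12s} {summary}")
```

With few trials the informative prior gives the tightest and most accurate posterior and the misinformed prior the most biased one; with many trials the three posteriors converge.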
The HBM functions as a generative model framework that leverages Bayes’ rule to quantify the joint distribution of population-, individual-, and test-level hyperparameters and parameters.
81–85 It can explicitly quantify correlations in the data through covariance parameters.
86–90 By sharing information within and across levels via conditional dependencies, it generates informative priors for each test and can therefore reduce the variance of test-level estimates by shrinking estimated parameters at the lower levels toward the modes of the higher levels when there is insufficient data at the lower levels.
81,84,91
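The shrinkage effect can be sketched with a one-dimensional partial-pooling example. This is a simplified empirical Bayes illustration, not the multivariate three-level HBM used in this study: it assumes Gaussian test-level estimates with known SDs, a known between-test SD, and a plug-in population mean. The function name and all numbers are hypothetical.

```python
def partial_pool(test_means, test_sds, pop_sd):
    """Shrink noisy test-level estimates toward a shared population mean.

    Assumes Gaussian estimates with known SDs and a known between-test SD;
    the population mean is a precision-weighted average (plug-in estimate).
    """
    weights = [1.0 / (s ** 2 + pop_sd ** 2) for s in test_sds]
    pop_mean = sum(w * m for w, m in zip(weights, test_means)) / sum(weights)
    pooled = []
    for m, s in zip(test_means, test_sds):
        k = pop_sd ** 2 / (pop_sd ** 2 + s ** 2)   # weight given to the test's own data
        estimate = k * m + (1.0 - k) * pop_mean
        post_sd = (k * s ** 2) ** 0.5               # always smaller than s
        pooled.append((estimate, post_sd))
    return pop_mean, pooled

# Hypothetical log10 CS estimates from three tests; the second is more precise.
test_means = [1.2, 1.7, 2.0]
test_sds = [0.25, 0.10, 0.25]

pop_mean, pooled = partial_pool(test_means, test_sds, pop_sd=0.2)
for m, s, (est, sd) in zip(test_means, test_sds, pooled):
    print(f"raw {m:.2f} ± {s:.2f} -> pooled {est:.2f} ± {sd:.2f}")
```

The imprecise tests are pulled strongly toward the population mean while the precise test barely moves, and every pooled SD is smaller than its raw counterpart.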
The HBM has found applications in diverse scientific disciplines, including astronomy,
92 ecology,
93,94 genetics,
95 machine learning,
96 and cognitive science.
82,84,85,91,97–102 Previous applications demonstrated that the HBM reduces uncertainties in the estimated parameters of the CSF
103 and visual acuity (VA) behavioral function
104 as well as uncertainties in estimated learning curves
105,106 compared to the BIP. Another development introduced a hierarchical Bayesian joint modeling (HBJM) framework to compute the collective endpoint (CE) from CSF and VA measurements, showing that CE offers more statistical power than the CSF or VA metrics alone.
107 (In the HBJM, the CSF was modeled as a three-parameter log parabola function rather than as CSs at specific SFs, as in the current HBM. Additionally, the HBJM modeled the relationship between CSF and VA parameters, whereas the current HBM models the relationship between CSs at different SFs.)
In this study, we applied the HBM and BIP to both the first 25 and all 50 trials of a publicly available dataset involving 112 subjects. These subjects underwent 50 qCSF trials in each of three luminance conditions, with two repeated tests in the high luminance condition.
46 The selected luminance levels (L, M, and H) were designed to induce AULCSF changes of 0.14, 0.29, and 0.43 log10 units, mimicking mild, medium, and large CSF deficits in clinical populations.
108–113 After obtaining the posterior distributions of CS from the HBM and BIP, we compared the two methods on (1) the mean and uncertainty of the estimated CSs, (2) the test-retest reliability of the estimated CSs in the H condition, (3) the sensitivity at 95% specificity and the accuracy in detecting CS changes between luminance conditions for each subject, and (4) the statistical power in detecting CS changes between luminance conditions at individual SFs and across multiple SFs at the group level. These comparative analyses aimed to provide insights into the performance of the two methods and to highlight the value of the HBM in enhancing statistical inference on CS at individual SFs and across multiple SFs.