Here, we have outlined a robust method that provides a quantitative description of the variability of an outcome measure and, in addition, quantifies the 95% CI for this variability measurement in a small cohort of patients. The coefficients of repeatability and associated CIs provide a basis for defining the minimum change required to assess safety and efficacy of a treatment. We emphasize that these quantitative descriptions of variability alone do not necessarily define the clinically meaningful change needed to address safety and efficacy issues. Instead, these RCs define the minimum level of change required in a parameter to be considered statistically different from baseline. For example, the analysis of Worse Eye acuity indicated that a change of between four and eight letters at follow-up could be used as evidence of significant change from baseline. Grover et al.
37 reported a similar criterion of seven letters for significant change in visual acuity for subjects with retinitis pigmentosa. However, a clinically meaningful change in visual acuity is frequently taken to be 15 letters (i.e., doubling of the visual angle).
55
To our knowledge, no previous study has examined variability of the ERG a-wave to the ISCEV standard maximal response flash. For the XLRS subjects in this study, the RCs for the ERG a-wave indicated that the criterion for a significant reduction in a-wave amplitude was approximately 40%. When the 95% CIs were taken into account, the criterion increased to a reduction of 50% to 56% for ERG a-wave amplitude. In previous studies examining variability of the dim flash ERG b-wave in retinitis pigmentosa patients and control subjects, the criteria for significant reduction in ERG b-wave amplitude ranged from 31% to 51%.
40,41,43,44 Lower short-term variability (< 2 weeks) in healthy subjects was obtained by carefully controlling recording conditions,
44 and considerably lower thresholds for significant reductions of rod (23%) and cone (37%) a-waves in response to high intensity flashes have been reported.
42 Variability for 30-Hz flicker amplitude is typically higher than for the scotopic b-wave.
42,43 These combined results suggest that a 50% reduction in amplitude may be a suitable criterion for a clinically meaningful change in the ERG a-wave recorded to a standard ISCEV maximal flash.
As with the ERG a-wave, to our knowledge, no previous study has examined variability of the ERG b/a ratio to the ISCEV standard maximal response flash. The ERG b/a ratio serves as an excellent parameter by which to judge efficacy in a clinical trial for XLRS. Successful treatment of XLRS could be expected to improve signaling from photoreceptors to the bipolar cells, which in turn would be reflected as an increase in the ERG b/a ratio. The upper limits of the criteria for significant change in the ERG b/a ratio were 0.44 for the Better Eye and 0.23 for the Worse Eye groups. Examination of the criteria for significant change in the ERG b/a ratio may help define eligible patients for a clinical trial. For our system, the mean b/a ratio for a healthy 25-year-old subject is 1.88. Allowing for the maximum criterion for significant improvement of 0.44 for the ERG b/a ratio, candidates for a treatment trial to evaluate efficacy would ideally have a maximal b/a ratio of less than or equal to 1.44.
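The eligibility arithmetic above can be written out as a minimal sketch. The two constants are the values quoted in the text; the function names and the example baseline ratios are hypothetical and used only for illustration, and the rule itself is a simplification that ignores any other inclusion criteria.

```python
# Values quoted in the text for the authors' recording system.
HEALTHY_MEAN_BA = 1.88        # mean b/a ratio, healthy 25-year-old subject
MAX_CHANGE_CRITERION = 0.44   # largest criterion for significant change (Better Eye)


def max_eligible_baseline(healthy_mean: float, criterion: float) -> float:
    """Highest baseline b/a ratio that still leaves room for a significant
    improvement before the healthy mean is reached."""
    return round(healthy_mean - criterion, 2)


def is_candidate(baseline_ba: float) -> bool:
    """Hypothetical screening rule: baseline at or below the eligibility ceiling."""
    ceiling = max_eligible_baseline(HEALTHY_MEAN_BA, MAX_CHANGE_CRITERION)
    return baseline_ba <= ceiling


# Example baselines are made up for illustration, not patient data.
for baseline in (1.20, 1.44, 1.60):
    print(baseline, is_candidate(baseline))
```

A subject whose baseline ratio already lies within 0.44 of the healthy mean could not show a change exceeding the criterion without surpassing that mean, which is the reasoning behind the 1.44 ceiling.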
In the present study, four of the subjects had no b-wave response, consistent with an absence of photoreceptor to bipolar cell signaling, and the floor for the b/a ratio was estimated at 0.6. Other studies have shown smaller b/a ratios of 0.3, as in four affected male relatives, age 32 to 45 years, with a mutation causing null-RS expression,
56 and this parameter may change slowly with time.
13,14,57 Therefore, the ERG b-wave is likely not a good candidate for assessing safety in a clinical trial of XLRS.
Defining what constitutes a clinically meaningful change in retinal sensitivity is perhaps less intuitive than for visual acuity and ERG measures. For MP1 mean macula sensitivity, the repeatability coefficients of 1.7 to 2.2 dB reported herein are similar to the values reported for age-related macular degeneration (AMD),
58 ABCA4 retinopathy,
36 and for intrasession variability in patients with unspecified macular disease.
35 However, mean sensitivity may not be the most appropriate measure unless a treatment is expected to affect a large section of the retina, such as the central 20° field recorded with the 10-2 pattern. To that end, we examined mean sensitivity across the central 10°, which exhibited similar, albeit slightly higher, variability (
Table 4). The disadvantage of mean sensitivity is that spatial information is lost. It is possible that a treatment for XLRS may restore function only at the margins of the central scotoma. Under this scenario, the sensitivity of some individual points may change substantially without affecting overall mean sensitivity. To account for this possibility, an understanding of the variability of point-wise sensitivity is important. The coefficients of repeatability for PWS in our XLRS patients were 6.8 and 5.4 dB for the Better and Worse Eye groups, respectively. These values are consistent with the 5.6 dB reported by Chen and colleagues
35 in an intrasession variability study of 50 maculopathy patients. Cideciyan et al.
36 reported a slightly lower repeatability coefficient of 4.2 dB for PWS in ABCA4 retinopathy with the MP1, but this lower value likely reflects the smoothing of individual thresholds by applying a 3-point spatial moving average. Wu et al.
58 similarly found a lower coefficient of repeatability for PWS of 4.2 dB in AMD patients using a macular integrity assessment (MAIA) fundus-guided perimeter that uses a scanning laser ophthalmoscope for fundus tracking. Of concern are the large CIs of the RCs for our PWS data, which extend the criteria for significant change of an individual point to 8.0 dB for the Worse Eye and 10.4 dB for the Better Eye groups. The MP1 has a stimulus range of 20 dB, and our results indicate that a change of up to one-half of this range is required to be considered significant. A possible alternative would be the use of scotopic two-color microperimetry, which extends the range of the MP1 to 50 dB by incorporating measurement of rod function.
59
XLRS typically involves the macula although schisis cavities may extend into the peripheral retina,
9,10 an area that could be examined in XLRS patients using standard automated perimetry (SAP). Coefficients of repeatability for PWS with SAP are comparable to the values reported here for XLRS and range from 5 to 7 dB in diabetics,
60 glaucoma,
61 and retinitis pigmentosa.
41,62 However, PWS variability of SAP dramatically increases with retinal eccentricity
60 and with decreasing retinal sensitivity, likely reflecting unsteady fixation or changes in eccentric fixation locus that cannot be compensated for with standard perimetry. Kinetic perimetry provides a relatively precise measurement of the edge of scotomas in retinal disease. Bittner et al.
63 reported a favorable 43% coefficient of repeatability for intrasession change in planimetric area in retinitis pigmentosa. The disadvantage of kinetic perimetry is that spatial information about the variability of sensitivity within the seeing area is lost, and variability increases substantially in subjects with poor acuity and narrow visual fields.
63 In the present study, we used fundus-guided perimetry to examine retinal sensitivity across the central 20°, where schisis cavities are most prevalent in XLRS. The major advantages of fundus-guided perimetry over SAP and kinetic perimetry for measuring retinal sensitivity in maculopathies such as XLRS include (1) the ability to compensate for eccentric and/or poor fixation stability, (2) the follow-up feature that enables the exact same areas of the retina to be tested longitudinally in a clinical trial, and (3) the ability to study structure–function correlations by overlaying retinal images with maps of retinal function. Changes in peripheral retinal function are not likely to be localized following intravitreal delivery of treatment and may well be reflected by changes in the full-field ERG.
In our XLRS patients, the criterion for significant change in central subfield retinal thickness (0.107 logOCT) corresponded to a 22% decrease or a 28% increase in thickness. That is, for an XLRS subject with a central retinal thickness of 500 μm at baseline, a decrease of greater than 110 μm or an increase of greater than 140 μm at follow-up would be considered significant. A retrospective study of diabetics with refractory diabetic macular edema (DME) similarly reported that central retinal thickness varied by up to 28% of median thickness over a 7-month period.
64 In contrast, for diabetic subjects without macular edema or with regressed macular edema, OCT measurements of central subfield retinal thickness varied by less than 11% of median thickness over periods extending from 17 to 22 months.
64,65
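The conversion of the 0.107 logOCT criterion into asymmetric percentage limits can be reproduced with a short sketch. It assumes that logOCT denotes the base-10 logarithm of retinal thickness, which is consistent with the 22% and 28% figures quoted above; the function names are hypothetical.

```python
def log_criterion_to_percent(log_rc: float) -> tuple[float, float]:
    """Convert a symmetric criterion in log10 units into
    (percent decrease, percent increase) on the linear scale."""
    percent_increase = (10 ** log_rc - 1.0) * 100.0
    percent_decrease = (1.0 - 10 ** (-log_rc)) * 100.0
    return percent_decrease, percent_increase


def thickness_limits(baseline_um: float, log_rc: float) -> tuple[float, float]:
    """Lower and upper thickness (micrometers) beyond which a follow-up
    value differs significantly from the given baseline."""
    return baseline_um * 10 ** (-log_rc), baseline_um * 10 ** log_rc


# 0.107 and 500 are the values quoted in the text; log10 is an assumption.
decrease_pct, increase_pct = log_criterion_to_percent(0.107)
lower_um, upper_um = thickness_limits(500.0, 0.107)
print(f"decrease {decrease_pct:.1f}%, increase {increase_pct:.1f}%")
print(f"limits for 500 um baseline: {lower_um:.1f} to {upper_um:.1f} um")
```

Because the criterion is symmetric on the log scale, it is asymmetric in micrometers. For a 500 μm baseline, the unrounded limits correspond to a decrease of about 109 μm and an increase of about 140 μm, in agreement with the rounded values given in the text.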
In summary, an appreciation of the range of the true repeatability coefficient for a parameter is fundamental to setting the minimum safety and efficacy limits for clinical trials. The 95% CI of the RC is crucial when variability is derived from a small number of subjects, as in our study. Our derivation of variability, which was based on four visits across 6 months, is a more general form of the classical test–retest paradigm and provides the additional advantage of assessing variability as a function of time. This is an important consideration in designing clinical trials where change in a parameter will be measured for an extended period relative to baseline.