The Reliability of Cone Density Measurements in the Presence of Rods
Author Affiliations & Notes
  • Jessica I. W. Morgan
    Scheie Eye Institute, Department of Ophthalmology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
    Center for Advanced Retinal and Ocular Therapeutics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
  • Grace K. Vergilio
    Scheie Eye Institute, Department of Ophthalmology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
  • Jessica Hsu
    Department of Bioengineering, University of Pennsylvania, Philadelphia, PA, USA
  • Alfredo Dubra
    Department of Ophthalmology, Stanford University, Stanford, CA, USA
  • Robert F. Cooper
    Scheie Eye Institute, Department of Ophthalmology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
    Department of Psychology, University of Pennsylvania, Philadelphia, PA, USA
  • Correspondence: Jessica I. W. Morgan, 3400 Civic Center Blvd, Ophthalmology 3rd floor west, 3-112W, Philadelphia, PA 19104, USA. e-mail: jwmorgan@pennmedicine.upenn.edu 
Translational Vision Science & Technology June 2018, Vol.7, 21. doi:https://doi.org/10.1167/tvst.7.3.21
Abstract

Purpose: Recent advances in adaptive optics scanning light ophthalmoscopy (AOSLO) have enabled visualization of cone inner segments through nonconfocal split-detection, in addition to rod and cone outer segments revealed by confocal reflectance. Here, we examined the interobserver reliability of cone density measurements in both AOSLO imaging modalities.

Methods: Five normal subjects (nine eyes) were imaged along the horizontal and vertical meridians using a custom AOSLO with confocal and nonconfocal split-detection modalities. The resulting images were montaged using a previously described semiautomatic algorithm. Regions of interest (ROIs) were selected from the confocal montage at 190 μm, and from split-detection and confocal montages at 900 and 1800 μm from the fovea. Four observers (three experts, one naïve) manually identified cone locations in each ROI, and these locations were used to calculate bound densities. Intraclass correlation coefficients and Dice's coefficients were calculated to assess interobserver agreement.

Results: Interobserver agreement was high in cone-only images (confocal 190 μm: 0.85; split-detection 900 μm: 0.91; split-detection 1800 μm: 0.89), moderate in confocal images at 900 μm (0.68), and poor in confocal images at 1800 μm (0.24). Excluding the naïve observer data substantially increased agreement within confocal images (190 μm: 0.99; 900 μm: 0.80; 1800 μm: 0.68).

Conclusions: Interobserver measurements of cone density are more reliable in rod-free retinal images. Moreover, when using manual cell identification, it is essential that observers are trained, particularly for confocal AOSLO images.

Translational Relevance: This study underscores the need for additional reliability studies in eyes containing pathology where identifying cones can be substantially more difficult.

Introduction
Since its invention, confocal adaptive optics scanning laser ophthalmoscopy (AOSLO)1 has enabled routine, noninvasive visualization of the cone photoreceptor mosaic geometry in normal and diseased retina.2 However, it was several years before technical improvements to confocal AOSLO design allowed regular visualization of the rod mosaic.3 More recently, another AOSLO imaging technique has enabled visualization of what is likely the cone inner segment mosaic. This technique, dubbed "nonconfocal split-detection,"4 collects nonconfocal light backscattered by the retina and splits it equally between two synchronized point detectors. While split-detection imaging has enhanced investigators' view of the cone mosaic, it cannot yet resolve cone inner segments near the fovea or rod inner segments in most subjects, with the exception of subjects who have very short axial lengths or some subjects with pathologic conditions. 
With the ability to visualize the photoreceptor mosaic comes the ability to quantify mosaic parameters using numerous metrics.5,6 To date, the most routinely used metric for assessing photoreceptor images is cone density, and the critical step in its direct measurement is the identification of all cones within a region of interest (ROI). Automated cone identification algorithms, in particular those that use local intensity maxima to identify cone locations,7,8 cannot selectively identify cones in images where rods and cones are interspersed, because rods, like cones, appear as bright dots in confocal AOSLO images. Moreover, even manual identification of cones in images containing both photoreceptor types is more difficult, because graders must distinguish whether each photoreceptor in the image is a cone or a rod, and the multimodal waveguided reflectance of cones outside the parafoveal region can appear as a group of small bright dots. Thus, while the ability to image rods represented a substantial improvement for noninvasive visualization of the retina, it came with the added complication that cone mosaic parameters became difficult to quantify in confocal images containing both rods and cones. 
What follows is an investigation of the impact of visible rod photoreceptors on the interobserver reliability of manual cone density measurements in normal eyes. We assessed interobserver reliability of cone density measurements from AOSLO images in which either only the cone mosaic is visualized (foveal confocal images and perifoveal split-detection images) or the rod and cone mosaics are interleaved (perifoveal confocal images). Understanding the reliability of reported cone density values is essential for characterizing retinal phenotypes and will be valuable as clinical trials of experimental therapeutics consider incorporating cone density as an outcome measure. 
Methods
This study was approved by the institutional review board at the University of Pennsylvania, and followed the tenets of the Declaration of Helsinki. Following explanation of the study, subjects gave informed consent and voluntarily participated. 
Nine eyes of five subjects aged 24 to 35 years with no known retinal pathology were included in this study. Axial lengths for each eye were obtained using an IOL Master (Carl Zeiss Meditec, Dublin, CA). AOSLO image scale was calibrated by imaging a Ronchi ruling positioned at the focal plane of a lens with a 19-mm focal length, yielding the conversion between image pixels and degrees of visual angle. We then used a proportional axial length method to approximate the retinal magnification factor (in microns/degree) and convert the angular scale to microns/pixel.9 Subjects' pupils were dilated and cyclopleged with phenylephrine hydrochloride (2.5%) and tropicamide (1%). 
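For readers who want to reproduce the scale conversion, the following is a minimal Python sketch of the calculation described above. It assumes a Bennett-style adjusted axial length formula (retinal magnification factor proportional to axial length minus 1.82 mm); the function names, the constant 0.01306, and the example pixels/degree value are illustrative and are not taken from the authors' implementation.

```python
# Sketch of the image-scale calculation described above (not the authors' code).
# Assumes the Bennett-style adjusted axial-length formula, in which the retinal
# magnification factor scales with (axial length - 1.82 mm).

def pixels_per_degree(ronchi_pixels: float, ronchi_degrees: float) -> float:
    """Angular scale measured by imaging a Ronchi ruling of known angular extent."""
    return ronchi_pixels / ronchi_degrees

def microns_per_degree(axial_length_mm: float) -> float:
    """Approximate retinal magnification factor (microns/degree) for a given eye.

    Uses the commonly quoted small-angle constant 0.01306 mm/deg per mm of
    (axial length - 1.82); treat this as an assumption, not the paper's exact method.
    """
    return 1000.0 * 0.01306 * (axial_length_mm - 1.82)

def microns_per_pixel(axial_length_mm: float, px_per_deg: float) -> float:
    """Combine the angular scale and retinal magnification factor."""
    return microns_per_degree(axial_length_mm) / px_per_deg

# Example: a 24.0-mm eye imaged at roughly 600 pixels/degree
scale = microns_per_pixel(24.0, pixels_per_degree(600.0, 1.0))
print(f"{scale:.3f} microns/pixel")   # ~0.48 microns/pixel
```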
The custom AOSLO used in this study has been previously described.4,10 A dental impression was used to align the subject to the AOSLO. An 848-nm superluminescent diode with a full-width at half-maximum (FWHM) bandwidth of 26 nm (Superlum, Cork, Ireland) was used for wavefront sensing, and a 97-actuator deformable mirror (Alpao SAS, France) provided the aberration correction. Confocal and nonconfocal split-detection images were acquired simultaneously at 16.7 frames per second over a 1° by 1° field of view using a superluminescent diode centered at 795 nm with FWHM of 15.3 nm (Superlum) and three photomultiplier tubes (PMT; Hamamatsu Corporation, Japan) configured as previously described.4 
Subjects were instructed to fixate (using the imaged eye) as steadily as possible on a target while AOSLO image sequences were acquired along all four retinal meridians out to approximately 1800 μm from the fovea. A custom algorithm was used for intraframe, strip-based registration and dewarping of the AOSLO images.11 Reference frames for registration were chosen manually from the confocal image sequence, and 50 frames of the confocal AOSLO images were registered and averaged. The same transformations applied to the confocal images were applied to the 50 simultaneously acquired nonconfocal split-detection images, as described in other simultaneous multimodal imaging paradigms.12 Averaged confocal and nonconfocal split-detection images were then automatically montaged (Fig. 1) using a previously described algorithm.13 Square ROIs of 80, 93, and 100 μm per side were extracted at 190, 900, and 1800 μm from the fovea, respectively, along all four meridians in the confocal montages, and at 900 and 1800 μm in the nonconfocal split-detection montages. When necessary, ROIs were minimally displaced to avoid the shadows of retinal blood vessels. One subject's montage did not extend out to 1800 μm inferior, resulting in a total of 178 ROIs for the study. 
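As an illustration of the ROI extraction step, the sketch below crops a square ROI of a given physical side length at a given eccentricity from a montage. It assumes the fovea location (in pixels) and the montage scale are already known; the meridian-to-image-direction mapping and all variable names are hypothetical, and the manual displacement used to avoid vessel shadows is ignored.

```python
import numpy as np

def extract_roi(montage: np.ndarray, fovea_xy: tuple, eccentricity_um: float,
                side_um: float, um_per_px: float, direction: str = "temporal") -> np.ndarray:
    """Crop a square ROI centered at a given eccentricity along one meridian.

    A sketch only: the mapping of meridian to image direction depends on the eye
    and montage orientation; the offsets used here are illustrative.
    """
    offsets = {"temporal": (1, 0), "nasal": (-1, 0), "superior": (0, -1), "inferior": (0, 1)}
    dx, dy = offsets[direction]
    ecc_px = eccentricity_um / um_per_px          # eccentricity in pixels
    half = int(round(side_um / um_per_px / 2))    # half the ROI side, in pixels
    cx = int(round(fovea_xy[0] + dx * ecc_px))
    cy = int(round(fovea_xy[1] + dy * ecc_px))
    return montage[cy - half:cy + half, cx - half:cx + half]

# e.g., a 93-um ROI at 900 um along one meridian, at an assumed 0.5 um/pixel:
# roi = extract_roi(montage, fovea_xy=(5000, 4000), eccentricity_um=900,
#                   side_um=93, um_per_px=0.5, direction="temporal")
```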
Figure 1
 
Example confocal and split-detection AOSLO images and montages showing the photoreceptor mosaic in the right eye of subject 11048. Twenty exemplar ROIs used for manual cone identification and cone density analysis are shown in the left panels. White boxes within the montage on the right outline the locations of the ROIs: 190 μm (confocal only), 900 μm and 1800 μm (both confocal and split-detection) along the superior, inferior, nasal, and temporal meridians. Scale bars for the ROI images: 25 μm.
Three expert graders and one naïve grader manually identified cone locations once per ROI. The naïve grader was instructed on the use of the custom software and was verbally instructed on how to make cone selections using a test set of images separate from those included in this study. The order in which the 178 ROIs were presented was randomized, and graders were masked to subject, eye, and meridian. Graders were able to adjust image brightness and contrast in both linear and logarithmic displays to aid in determining the presence of cone photoreceptors. Graders were instructed to mark cell centers by manually clicking on each cell. These locations were used to determine the Voronoi mosaic, and only cone locations whose Voronoi cells were fully contained within the ROI were used for the cone density calculation. Bound cone density was then calculated by dividing the number of bound Voronoi cells by the sum of their areas.5 
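The following is a minimal sketch of the bound Voronoi density calculation described above, using SciPy's Voronoi implementation. It assumes cone coordinates are given in microns within a square ROI whose lower-left corner is at the origin; it illustrates the approach of Cooper et al.,5 not the authors' software, and the function name is hypothetical.

```python
import numpy as np
from scipy.spatial import Voronoi

def bound_cone_density(coords_um: np.ndarray, roi_size_um: float) -> float:
    """Bound cone density (cones/mm^2) from manually marked cone centers.

    Keep only cones whose Voronoi cells are finite and lie entirely inside the
    square ROI [0, roi_size_um] x [0, roi_size_um], then divide the number of
    kept cells by their summed area.
    """
    vor = Voronoi(coords_um)
    n_bound, total_area_um2 = 0, 0.0
    for point_idx, region_idx in enumerate(vor.point_region):
        region = vor.regions[region_idx]
        if not region or -1 in region:                     # open (unbounded) cell
            continue
        verts = vor.vertices[region]
        if np.any(verts < 0) or np.any(verts > roi_size_um):
            continue                                       # cell spills outside the ROI
        x, y = verts[:, 0], verts[:, 1]
        area = 0.5 * abs(np.dot(x, np.roll(y, 1)) - np.dot(y, np.roll(x, 1)))  # shoelace
        n_bound += 1
        total_area_um2 += area
    return n_bound / total_area_um2 * 1e6                  # 1/um^2 -> 1/mm^2
```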
We undertook two different analyses to assess interobserver agreement for cone density measurements and cone identifications. First, interobserver agreement in cone density measurements was assessed at 190 μm for confocal images and at 900 and 1800 μm for both confocal and nonconfocal split-detection images using intraclass correlation (ICC) coefficients with 95% confidence intervals (CIs). Agreement was assessed between the three expert graders, as well as between the three expert graders and the naïve grader. Cone density measurements from confocal and nonconfocal split-detection images at identical retinal locations were also compared between observers. A paired t-test was used to assess whether cone density in split-detection images was significantly different from cone density in confocal images for each observer. 
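For illustration, the sketch below computes a two-way random-effects, absolute-agreement, single-rater ICC (Shrout and Fleiss ICC(2,1)) from a matrix of cone densities, along with the paired t-test between modalities. The specific ICC form used in the study is not restated here, so treat that choice, as well as the variable names, as assumptions.

```python
import numpy as np
from scipy import stats

def icc_2_1(ratings: np.ndarray) -> float:
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    ratings: (n_images, n_observers) array of cone densities. This particular
    ICC variant is an assumption, not a restatement of the paper's methods.
    """
    n, k = ratings.shape
    grand = ratings.mean()
    row_means = ratings.mean(axis=1)
    col_means = ratings.mean(axis=0)
    msr = k * np.sum((row_means - grand) ** 2) / (n - 1)      # between-image mean square
    msc = n * np.sum((col_means - grand) ** 2) / (k - 1)      # between-observer mean square
    sse = np.sum((ratings - row_means[:, None] - col_means[None, :] + grand) ** 2)
    mse = sse / ((n - 1) * (k - 1))                           # residual mean square
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)

# Paired t-test comparing split-detection and confocal densities for one observer,
# where density_split and density_confocal are equal-length arrays of paired ROIs:
# t, p = stats.ttest_rel(density_split, density_confocal)
```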
Second, we compared the sensitivity and precision of cone identifications between pairs of observers for each image. Co-located cones were found using the following method: First, we determined the average and standard deviation of the nearest-neighbor distance across all coordinates from each expert observer within each image. We then clustered cone locations across all observers within each image by grouping locations separated by less than the mean nearest-neighbor distance plus two standard deviations, allowing only one cone selection per observer in each cluster. We then assessed the similarity of cone identifications between observers. To do this, we considered one expert observer to be "ground truth" and, for each of the other observers, counted the number of true positives (\({N_{TP}}\)), false positives (the comparison observer identified a cone where the "ground truth" expert observer did not, \({N_{FP}}\)), and false negatives (the comparison observer did not identify a cone where the "ground truth" expert observer did, \({N_{FN}}\)), including only cones with bound Voronoi areas. Thus, the number of cone identifications made by each observer can be expressed as:  
\begin{equation}{N_{comparison\ observer}} = {N_{TP}} + {N_{FP}}\end{equation}
 
\begin{equation}{N_{ground\ truth\ expert}} = {N_{TP}} + {N_{FN}}\end{equation}
In order to compare data sets from different observers, we then calculated the true positive rate, the false discovery rate, and Dice's coefficient for each image as:  
\begin{equation}{\it{true\ positive\ rate}} = {N_{TP}}/{N_{ground\ truth\ expert}}\end{equation}
 
\begin{equation}{\it{false\ discovery\ rate}} = {N_{FP}}/{N_{comparison\ observer}}\end{equation}
 
\begin{equation}{\it{Dice{\mbox{'}}s\ coefficient}} = 2{N_{TP}}/\left( {{N_{ground\ truth\ expert}} + {N_{comparison\ observer}}} \right)\end{equation}
where Dice's coefficient is used as a metric for describing the similarity between two data sets.14–16 We considered all permutations between graders where each expert grader was considered the ground truth and all other observers' cone identifications were compared to that expert's identifications. (For example, first considering expert observer 1 as ground truth and comparing observer 2 to 1, 3 to 1, and 4 to 1. Then, considering expert observer 2 as ground truth, and comparing 1 to 2, 3 to 2, and 4 to 2, etc.) We did not perform an analysis considering the naïve observer's cone identifications as ground truth.  
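The following sketch illustrates the pairwise comparison described above: nearest-neighbor statistics define a matching radius, co-located cone selections are paired one-to-one, and the true positive rate, false discovery rate, and Dice's coefficient are computed from the counts. It is a simplified, pairwise approximation of the multi-observer clustering used in the study, and all function names are illustrative.

```python
import numpy as np
from scipy.spatial import cKDTree

def nn_stats(xy: np.ndarray):
    """Mean and standard deviation of one observer's nearest-neighbor distance."""
    d, _ = cKDTree(xy).query(xy, k=2)        # k=2: the first neighbor is the point itself
    return d[:, 1].mean(), d[:, 1].std()

def match_and_score(truth_xy: np.ndarray, test_xy: np.ndarray,
                    nn_mean: float, nn_std: float):
    """Greedy one-to-one matching of two observers' cone locations.

    Locations closer than (mean + 2*std) of the ground-truth observer's
    nearest-neighbor distance are treated as the same cone. This pairwise
    version is an approximation of the full multi-observer clustering.
    """
    radius = nn_mean + 2.0 * nn_std
    dists, idx = cKDTree(truth_xy).query(test_xy)
    taken, n_tp = set(), 0
    for d, i in sorted(zip(dists, idx)):     # match closest pairs first
        if d <= radius and i not in taken:
            taken.add(i)
            n_tp += 1
    n_fp = len(test_xy) - n_tp               # marked by the comparison observer only
    n_fn = len(truth_xy) - n_tp              # marked by the ground-truth observer only
    tpr = n_tp / len(truth_xy)
    fdr = n_fp / len(test_xy)
    dice = 2 * n_tp / (len(truth_xy) + len(test_xy))
    return tpr, fdr, dice
```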
Results
Cone density measurements by the three expert observers showed a range of agreement (Fig. 2). Confocal images of the parafoveal cone mosaic (190 μm ROIs) demonstrated the highest agreement between expert observers when compared to other retinal eccentricities (Table 1). Measurements of cone density from split-detection images showed higher agreement between expert observers than measurements of cone density from confocal images at the same locations. 
Figure 2
 
The range in agreement between cone density measurements (highest, median, and lowest) made by the three expert observers for all five ROI types: confocal 190 μm, confocal 900 μm, confocal 1800 μm, split-detection 900 μm, and split-detection 1800 μm. Cones identified by all three expert observers are marked in orange. Locations identified as a cone by only one of the three expert observers are denoted by a colored dot. Observers 1, 2, and 3 are yellow, blue, and magenta, respectively. Locations identified as cones by two of the three expert observers are denoted by an X, with a color corresponding to the observer who did not identify that location as a cone. Confocal images at 190 μm showed the highest agreement for cone density measurements between expert observers. Cone density measurements made using split-detection images showed a higher interobserver agreement than confocal images at the same retinal eccentricity. The lowest agreement confocal image at 900 μm corresponds to the outlier in Figure 4C, where it is likely that observer 3 misidentified rods as cones in this confocal image. Scale bars: 25 μm.
Table 1
 
Agreement in Cone Density Measurements Made by Three Expert Observers
Using the scale described by Cicchetti17 (>0.75, excellent; 0.6–0.74, good; 0.4–0.59, fair; <0.4, poor), expert interobserver agreement was excellent in cone-only images (confocal OS 190 μm: 0.99, CI: 0.98–1.0; split-detection IS 900 μm: 0.95, CI: 0.91–0.97; split-detection IS 1800 μm: 0.89, CI: 0.82–0.94). Interobserver agreement was also excellent in confocal OS images at 900 μm and good in confocal OS images at 1800 μm (900 μm: 0.80, CI: 0.68–0.88; 1800 μm: 0.68, CI: 0.52–0.82). Observers 1 and 3 measured significantly higher densities (by 640 and 920 cones/mm2, P = 0.005 and P < 0.001, respectively) in confocal OS images than in split-detection IS images at the same locations. Conversely, observer 2 measured a significantly higher density (by 550 cones/mm2, P < 0.001) in split-detection IS images than in confocal images at the same locations (Figs. 3, 4). This corresponds to 5.3 and 6.5 more bound manual cone identifications per ROI in confocal images than in split-detection images for observers 1 and 3, respectively, and 4.7 more bound manual cone identifications per ROI in split-detection images than in confocal images for observer 2. 
Figure 3
 
Cone identifications from paired confocal and split-detection images showing the highest and lowest correlation on average between cone density measurements overlaid on split-detection images for each of the three expert observers. Orange dots show locations where an expert observer identified a cone in both the split-detection and confocal image. Red and blue dots show locations where a cone was identified in either the confocal or split-detection image only (red, confocal; blue, split-detection). Scale bars: 25 μm.
Figure 4
 
Bland-Altman plots for each observer show the difference between split-detection and confocal cone densities versus the mean of the paired densities for paired ROIs at 900 and 1800 μm. Observers 1 (A) and 3 (C) measured a higher cone density in confocal images (640 and 920 cones/mm2, P = 0.005 and P < 0.001, respectively), while observer 2 (B) measured a higher cone density in split-detection images (550 cones/mm2, P < 0.001). Post-hoc evaluation showed that the outlier points for observers 1 and 3 were caused by erroneous cone identifications (selection of rods) in confocal images more often than by erroneous cone identifications in split-detection images. Observer 4 (D, naïve observer) measured a higher cone density in split-detection images (2300 cones/mm2, P < 0.0001) and showed more variability in agreement between confocal and split-detection measures of cone density at the same retinal locations.
Dice's coefficients between the expert observers showed high similarity on average at all eccentricities for both confocal and split-detection modalities (Table 2). On average, Dice's coefficients for confocal images were highest at the 190-μm eccentricity (0.971) and decreased with increasing eccentricity for all expert observers (900 μm, 0.925; 1800 μm, 0.906). The standard deviation of Dice's coefficients over all images was also higher at the perifoveal retinal locations than at 190 μm. Dice's coefficients for 900-μm split-detection images were lower than those at 1800 μm (900 μm, 0.883; 1800 μm, 0.916) and, at 900 μm, were also lower than those for confocal images. 
Table 2
 
Similarity in Cone Identifications Made by Three Expert Observers and 1 Naïve Observer
Including cone density measurements made by the naïve observer with those from the experts decreased the interobserver agreement at all eccentricities (Table 3). In this case, interobserver agreement was excellent in cone-only images (confocal 190 μm: 0.85, CI: 0.76–0.91; split-detection 900 μm: 0.91, CI: 0.85–0.95; split-detection 1800 μm: 0.89, CI: 0.82–0.94), good in confocal images at 900 μm (0.68, CI: 0.55–0.80), and poor in confocal images at 1800 μm (0.24, CI: 0.09–0.44). The naïve observer showed a significant difference (P < 0.0001) toward identifying more cones in split-detection images than in confocal images and showed higher variability in agreement between cone density measurements made in paired confocal and split-detection images at the same retinal location in comparison to expert observers (Fig. 4, bottom right). In comparison to the expert observers, Dice's coefficients were reduced for the naïve observer at all eccentricities for confocal images (190 μm, 0.941; 900 μm, 0.859; 1800 μm, 0.872) but not for split detection (900 μm, 0.808; 1800 μm, 0.910; Table 2). 
Table 3
 
Agreement in Cone Density Measurements Made by Three Expert Observers and 1 Naïve Observer
Discussion
Our high interobserver agreement for confocal images at 190 μm is consistent with previous reports in normal, cone-only parafoveal images.18,19 Our results in confocal images show that, while interobserver reliability is considered good or excellent at all retinal locations studied, there is a measurable decrease with eccentricity (Table 1). We attribute part, but not all, of this decrease to the presence of rods in perifoveal images, because interobserver reliability for split-detection images (presumed to show cones, not rods) was higher than for paired confocal images. However, the decrease in interobserver reliability with eccentricity is not entirely due to the intrusion of rods into the images, because interobserver reliability for split-detection images was lower than for cone density measurements in 190-μm confocal images and also decreased with eccentricity, though to a lesser extent than in confocal images. 
The true positive rate, false discovery rate, and Dice's coefficient also showed differences with retinal eccentricity and imaging modality. As would be expected based on the ICC agreement between cone densities, Dice's coefficient and the true positive rate decreased with increasing eccentricity while the false discovery rate increased with eccentricity for confocal images. Surprisingly, Dice's coefficient was higher for 900 μm confocal images than for 900 μm split-detection images for comparisons involving expert observer 2, despite the fact that cone density measurements showed higher interobserver agreement in split-detection images compared to confocal images. This result could occur if cell selections made by expert observer 2 were shifted relative to the cell selections made by expert observers 1 and 3, resulting in pairs of “false positives” and “false negatives” that would both fall outside of the clustering area used for determining locations considered to have one-to-one pairing. Such pairs of “false positives” and “false negatives” would cause Dice's coefficient to be reduced while not affecting cone density measurements. 
The size of an ROI is known to influence cone density.8,20 The ROI sizes used in this study were chosen based on the method described by Cooper et al.5 and increased with eccentricity such that ROIs at 900 and 1800 μm would contain close to 100 cones, while the parafoveal ROI, where the mosaic exhibits tight triangular packing, would contain a large enough continuous area of the mosaic to capture subtle variations in cone packing. Averaged over all observers, images at 190 μm had 409 bound cones identified within the ROI, compared with 139 bound cones at 900 μm and 91 bound cones at 1800 μm. Thus, a single cone misidentification at the perifoveal eccentricities represents a greater percentage of the total cones selected and could therefore have a larger impact on interobserver agreement and Dice's coefficient than at the parafoveal eccentricity. 
Our results showed reduced reliability between observers, by both ICC and Dice's coefficient, when cone identifications from the naïve observer were included. The naïve observer, new to the cone identification process, had spent only a couple of months working with adaptive optics retinal images. In contrast, the expert observers all had several years of experience in the field and, specifically, in identifying cones in confocal images. In addition, all three expert reviewers had been exposed to nonconfocal split-detection imaging of cone inner segments for at least a year; however, only one of them had previous experience identifying cones in this imaging modality. Thus, the graders overall had less exposure to split-detection than to confocal images. Given that interobserver agreement for density was higher in split-detection images than in the paired confocal images (both including and excluding the results of the naïve observer), this may indicate that observers require less training to agree on cone densities in the split-detection modality than in the confocal modality. Alternatively, the higher agreement could arise in part from the lack of visible rods in the split-detection images, since the confocal images at the 190-μm ROI also showed better interobserver agreement (even when including the naïve observer) than the confocal images at greater eccentricities. Regardless, these results highlight the need for training when performing manual AOSLO image analysis, especially as AO imaging becomes more accessible and reading centers are developed for interpreting AO data sets. 
Several automated algorithms for identifying normal cones in parafoveal (rod-free) confocal images7,8,21–23 and in split-detection images14,23–25 have shown promise. However, none of these algorithms has demonstrated the ability to reliably distinguish cones from rods. As a result, confocal images at eccentricities beyond the parafovea required manual cone identifications for density analysis. Our goal for the present study was to understand how eccentricity and imaging modality, not cone identification algorithms, affect cone density reliability. For this reason, we chose to grade all images manually, rather than use semiautomated algorithms in some conditions and fully manual grading in others. Until reliable algorithms are developed for all retinal eccentricities in both normal and diseased images, manual or semiautomatic grading of cone locations will remain necessary, and limitations inherent to these methods, such as the reliability between graders, will remain present in cone mosaic measurements. 
Expert observers 1 and 3 each had one outlier image when comparing cone density measurements made from confocal versus split-detection images (Figs. 4A, 4C). Upon re-examination, it was apparent that both outlier data points were caused by marking rods as cones in the confocal image rather than by missing cones that should have been marked in the split-detection image (Fig. 2, confocal 900 μm, lowest agreement). These errors again point to an as-yet unmet need for fully automated cone identification algorithms that use both confocal and split-detection image features to identify cone locations. Until then, one could consider allowing graders to identify cones in confocal and split-detection images of the same retinal location simultaneously, or to review cone identifications prior to unmasking results, to mitigate errors caused by manual grading. 
It is important to note that these data likely represent a best-case scenario: images from patients with retinal disease are typically of lower quality than the high-quality images from young normal controls included in our study. Image interpretation can therefore be substantially more difficult in images of pathology than of normal anatomy, and reported intergrader agreement is lower in images of pathology than in normal images.26,27 Interestingly, Tanna et al.26 showed higher reliability for cone density measurements in split-detection images than in paired confocal images for patients with Stargardt disease and retinitis pigmentosa GTPase regulator-associated retinopathy, consistent with our finding that cone identification is less difficult in split-detection images. More generally, it is important that investigators understand how a given disease affects the reliability of their cone density measurements, especially if those measurements are to be used to assess disease progression or as an outcome measure to determine the effect of interventions on cone morphology. This could be accomplished by establishing the reliability of cone metrics for each disease and applying those estimates to future studies of that disease; alternatively, studies using cone density or other mosaic metrics could document the reliability of grading within the study. 
In summary, we show that it is important to consider both AOSLO imaging modality and retinal eccentricity when evaluating the reliability of cone metrics. Studies such as the present one, which strive to understand the confidence with which cone density measurements are reported, will be required before cone density can be used as a reliable biomarker for disease phenotyping and progression monitoring or as an outcome measure to assess disease intervention. 
Acknowledgments
We thank Joseph Carroll for providing cone identification software and for helpful discussions, and Gui-shuang Ying for statistical advice. 
Supported by NIH R01EY028601; NIH U01EY025477; NIH P30 EY001583; Foundation Fighting Blindness; Research to Prevent Blindness Stein Innovation Award; the F. M. Kirby Foundation; and the Paul and Evanina Mackall Foundation Trust. 
Disclosure: J.I.W. Morgan, P (US Patent 8226236), F (AGTC); G.K. Vergilio, None; J. Hsu, None; A. Dubra, P (US Patent 8226236), C (Boston Micromachines Corporation and Meira Gtx); R.F. Cooper, None 
References
1. Roorda A, Romero-Borja F, Donnelly WJ III, Queener H, Hebert TJ, Campbell MCW. Adaptive optics scanning laser ophthalmoscopy. Opt Express. 2002; 10: 405–412.
2. Morgan JI. The fundus photo has met its match: optical coherence tomography and adaptive optics ophthalmoscopy are here to stay. Ophthalmic Physiol Opt. 2016; 36: 218–239.
3. Dubra A, Sulai Y, Norris JL, et al. Noninvasive imaging of the human rod photoreceptor mosaic using a confocal adaptive optics scanning ophthalmoscope. Biomed Opt Express. 2011; 2: 1864–1876.
4. Scoles D, Sulai YN, Langlo CS, et al. In vivo imaging of human cone photoreceptor inner segments. Invest Ophthalmol Vis Sci. 2014; 55: 4244–4251.
5. Cooper RF, Wilk MA, Tarima S, Carroll J. Evaluating descriptive metrics of the human cone mosaic. Invest Ophthalmol Vis Sci. 2016; 57: 2992–3001.
6. Litts KM, Cooper RF, Duncan JL, Carroll J. Photoreceptor-based biomarkers in AOSLO retinal imaging. Invest Ophthalmol Vis Sci. 2017; 58: BIO255–BIO267.
7. Li KY, Roorda A. Automated identification of cone photoreceptors in adaptive optics retinal images. J Opt Soc Am A. 2007; 24: 1358–1363.
8. Garrioch R, Langlo C, Dubis AM, Cooper RF, Dubra A, Carroll J. Repeatability of in vivo parafoveal cone density and spacing measurements. Optom Vis Sci. 2012; 89: 632–643.
9. Bennett AG, Rudnicka AR, Edgar DF. Improvements on Littmann's method of determining the size of retinal features by fundus photography. Graefes Arch Clin Exp Ophthalmol. 1994; 232: 361–367.
10. Dubra A, Sulai Y. Reflective afocal broadband adaptive optics scanning ophthalmoscope. Biomed Opt Express. 2011; 2: 1757–1768.
11. Dubra A, Harvey Z. Registration of 2D images from fast scanning ophthalmic instruments. Lecture Notes in Computer Science. 2010; 6204: 60–71.
12. Morgan JIW, Dubra A, Wolfe R, Merigan WH, Williams DR. In vivo autofluorescence imaging of the human and macaque retinal pigment epithelial cell mosaic. Invest Ophthalmol Vis Sci. 2009; 50: 1350–1359.
13. Chen M, Cooper RF, Han GK, Gee J, Brainard DH, Morgan JI. Multi-modal automatic montaging of adaptive optics retinal images. Biomed Opt Express. 2016; 7: 4899–4918.
14. Cunefare D, Cooper RF, Higgins B, et al. Automatic detection of cone photoreceptors in split detector adaptive optics scanning light ophthalmoscope images. Biomed Opt Express. 2016; 7: 2036–2050.
15. Dice LR. Measures of the amount of ecologic association between species. Ecology. 1945; 26: 297–302.
16. Sørensen T. A method of establishing groups of equal amplitude in plant sociology based on similarity of species and its application to analyses of the vegetation on Danish commons. Biol Skr. 1948; 5: 1–34.
17. Cicchetti DV. Guidelines, criteria, and rules of thumb for evaluating normed and standardized assessment instruments in psychology. Psychol Assess. 1994; 6: 284–290.
18. Liu BS, Tarima S, Visotcky A, et al. The reliability of parafoveal cone density measurements. Br J Ophthalmol. 2014; 98: 1126–1131.
19. Bidaut Garnier M, Flores M, Debellemaniere G, et al. Reliability of cone counts using an adaptive optics retinal camera. Clin Experiment Ophthalmol. 2014; 42: 833–840.
20. Lombardo M, Serrao S, Ducoli P, Lombardo G. Influence of sampling window size and orientation on parafoveal cone packing density. Biomed Opt Express. 2013; 4: 1318–1331.
21. Chiu SJ, Lokhnygina Y, Dubis AM, et al. Automatic cone photoreceptor segmentation using graph theory and dynamic programming. Biomed Opt Express. 2013; 4: 924–937.
22. Bukowska DM, Chew AL, Huynh E, et al. Semi-automated identification of cones in the human retina using circle Hough transform. Biomed Opt Express. 2015; 6: 4676–4693.
23. Cunefare D, Fang L, Cooper RF, Dubra A, Carroll J, Farsiu S. Open source software for automatic detection of cone photoreceptors in adaptive optics ophthalmoscopy using convolutional neural networks. Sci Rep. 2017; 7: 6620.
24. Liu J, Jung H, Dubra A, Tam J. Automated photoreceptor cell identification on nonconfocal adaptive optics images using multiscale circular voting. Invest Ophthalmol Vis Sci. 2017; 58: 4477–4489.
25. Bergeles C, Dubis AM, Davidson B, et al. Unsupervised identification of cone photoreceptors in non-confocal adaptive optics scanning light ophthalmoscope images. Biomed Opt Express. 2017; 8: 3081–3094.
26. Tanna P, Kasilian M, Strauss R, et al. Reliability and repeatability of cone density measurements in patients with Stargardt disease and RPGR-associated retinopathy. Invest Ophthalmol Vis Sci. 2017; 58: 3608–3615.
27. Abozaid MA, Langlo CS, Dubis AM, Michaelides M, Tarima S, Carroll J. Reliability and repeatability of cone density measurements in patients with congenital achromatopsia. Adv Exp Med Biol. 2016; 854: 277–283.