Cone photoreceptors are vital for human vision, serving both daylight and color vision. Diseases such as Stargardt's disease [1], retinitis pigmentosa [2], choroideremia [3], and macular degeneration [4] are characterized by a loss of photoreceptors that leads to impaired vision. The photoreceptor array can be imaged with an adaptive optics scanning laser ophthalmoscope (AOSLO). Two common variants of the AOSLO imaging modality are confocal and split detector, each providing slightly different information on photoreceptor structure [5,6].
Regardless of the method used to acquire the images, the cones must be located within the image before quantitative metrics, such as cone density, spacing, and packing arrangement, can be extracted. Given the high density of cones within an image, manual cone identification is time-consuming and inconsistent. Several automatic or semi-automatic methods have been proposed to make cone detection faster and more consistent. Some are based on standard image analysis techniques: image histogram analysis [7], multi-scale modeling and normalized cross-correlation [8], a circular Hough transform [9], and multiscale circular voting [10]. In recent years, machine learning methods have also been applied to this problem. Cunefare et al. [11] proposed a so-called “patch-based” method that generates a probability map with a sliding-window convolutional neural network (CNN) and then postprocesses this map to locate cone positions. The CNN, which operates on a small window of the image, performs a binary (two-class) classification: the patch is either centered on a cone or not. Moving the window across the image yields the probability map. The postprocessing needed to extract peaks (cone locations) from the probability map involves several steps and several tunable parameters.
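To make the sliding-window idea concrete, the following is a minimal sketch in PyTorch; the tiny architecture, patch size, and looping strategy are illustrative assumptions for this sketch, not the implementation of Cunefare et al. [11].

```python
# Illustrative sliding-window ("patch-based") cone detector.
# The architecture and PATCH size are assumptions, not the network of [11].
import torch
import torch.nn as nn
import torch.nn.functional as F

PATCH = 33  # assumed odd window size, so every patch has a central pixel

class PatchClassifier(nn.Module):
    """Two-class CNN: is this patch centered on a cone or not?"""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(8, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classify = nn.Linear(16 * (PATCH // 4) ** 2, 2)

    def forward(self, x):                       # x: (N, 1, PATCH, PATCH)
        return self.classify(self.features(x).flatten(1))

def probability_map(image: torch.Tensor, net: PatchClassifier) -> torch.Tensor:
    """Evaluate the classifier on the patch centered at every pixel."""
    half = PATCH // 2
    padded = F.pad(image[None, None], (half,) * 4, mode="reflect")[0, 0]
    H, W = image.shape
    prob = torch.zeros(H, W)
    with torch.no_grad():
        for i in range(H):                      # one forward pass per pixel:
            for j in range(W):                  # heavily overlapping patches
                patch = padded[i:i + PATCH, j:j + PATCH]  # are re-evaluated
                logits = net(patch[None, None])
                prob[i, j] = logits.softmax(dim=1)[0, 1]  # P(cone-centered)
    return prob
```

In practice the patches would be batched rather than evaluated one at a time; the double loop is kept here only to make the redundant per-pixel evaluation explicit.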
Heisler et al. [12] investigated the use of transfer learning on the network of Cunefare et al. [11] to enable classification of previously unseen data collected from a different imaging modality (AO scanning laser ophthalmoscope). Davidson et al. [13] proposed a method using a multidimensional recurrent neural network (RNN), which generates a probability map for the entire image in a single set of computations.
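Both families of methods end by extracting local maxima from a probability map. As a point of reference, a minimal stand-in for that postprocessing, assuming scikit-image and with hypothetical parameter values, could be:

```python
# Minimal peak extraction from a probability map; a simplified stand-in for
# the multi-step, multi-parameter postprocessing used by the methods above.
import numpy as np
from skimage.feature import peak_local_max

def detect_cones(prob_map: np.ndarray,
                 min_distance: int = 3,    # assumed minimum cone spacing (px)
                 threshold: float = 0.5) -> np.ndarray:
    """Return (row, col) coordinates of probability peaks above `threshold`."""
    return peak_local_max(prob_map,
                          min_distance=min_distance,
                          threshold_abs=threshold)
```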
Patch-based CNNs have commonly been applied to ophthalmic medical imaging tasks, such as retinal segmentation and classification, and provide state-of-the-art performance in these areas [14–16]. Fully convolutional networks (FCNs) are an extension of CNNs [17]. The main benefit of an FCN is its ability to process an entire image at once and produce a per-pixel probability map. FCNs are commonly used for object segmentation, region labeling, and other per-pixel operations [17], and have been used for geographic atrophy segmentation in retinal tomography images [18]. FCNs are also common in medical image processing more broadly, for problems such as retinal layer segmentation [19], segmentation of neuronal structures [20], and cell detection [21]. For per-pixel operations, an FCN is typically faster than a patch-based CNN or an RNN: the FCN passes over the data only once, whereas a patch-based CNN repeatedly evaluates overlapping windows and the recurrent loops of an RNN increase the number of operations [22].
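To make the contrast with the patch-based sketch above concrete, a minimal FCN sketch follows, again assuming PyTorch; the encoder-decoder shown is illustrative only and is not the architecture evaluated in this work.

```python
# Minimal encoder-decoder FCN: one forward pass maps a whole image to a
# per-pixel cone probability map (illustrative architecture only).
import torch
import torch.nn as nn

class TinyFCN(nn.Module):
    def __init__(self):
        super().__init__()
        self.encode = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                          # downsample by 2
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
        )
        self.decode = nn.Sequential(
            nn.ConvTranspose2d(32, 16, 2, stride=2),  # upsample back
            nn.ReLU(),
            nn.Conv2d(16, 1, 1),                      # one logit per pixel
        )

    def forward(self, x):               # x: (N, 1, H, W), H and W even
        return torch.sigmoid(self.decode(self.encode(x)))

image = torch.rand(1, 1, 128, 128)      # synthetic stand-in for an AOSLO frame
prob_map = TinyFCN()(image)             # (1, 1, 128, 128) in a single pass
```

Unlike the sliding-window loop sketched earlier, the convolutional features here are computed once and shared across all output pixels, which is the source of the single-pass speed advantage described above.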
In this work, we propose the application of an FCN to cone detection in confocal and split detector AOSLO images. We use a previously published patch-based method [11] as a baseline to assess the benefit that an FCN approach may offer for this particular problem.