The CNN that we developed to segment capillaries from AOSLO perfusion images was built in Python and based on U-Net, an open-source CNN initially used to segment cells in microscopy images.37 The software and sample data described in this work are available on GitHub (https://github.com/porter-lab-software/AOVesselCNN). Two key steps in the U-Net architecture are (1) a contracting path that captures learned context and (2) a symmetric expanding path that enables precise localization. The network described in this work was based on CNNs previously used to segment vasculature in fundus images26 and OCTA images29 and was subsequently altered to optimize the model for automatically segmenting AOSLO perfusion images.
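The contracting/expanding symmetry of U-Net can be illustrated with a minimal NumPy sketch. This is not the actual network: learned convolutions are omitted, the depth and pooling factors are illustrative, and the skip connections are reduced to a simple average rather than the channel concatenation U-Net uses.

```python
import numpy as np

def max_pool2(x):
    """2 x 2 max pooling: halves each spatial dimension (contracting path)."""
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

def upsample2(x):
    """Nearest-neighbor upsampling: doubles each spatial dimension (expanding path)."""
    return x.repeat(2, axis=0).repeat(2, axis=1)

img = np.random.default_rng(0).random((64, 64))

# Contracting path: spatial resolution halves at each level
d1 = max_pool2(img)   # 32 x 32
d2 = max_pool2(d1)    # 16 x 16

# Symmetric expanding path: resolution is restored level by level, and each
# level is merged with the matching contracting-path output (skip connection)
u1 = (upsample2(d2) + d1) / 2   # 32 x 32
u0 = (upsample2(u1) + img) / 2  # 64 x 64, same size as the input
```

The round trip ends at the input resolution, which is what lets the expanding path assign a class to every pixel of the original image.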
The detailed architecture for our novel CNN is shown in Table 1. The general pattern of repeating groups of convolution, dropout, batch normalization, and pooling layers is a common feature of CNNs.38 The CNN begins with a convolutional layer, which convolves an input image with a filter of a specified kernel size. The convolutional response of the filter with the input is passed to the next layer. Dropout layers within the network prevent overfitting of network units to the training data by randomly removing units from the CNN during training. For this CNN, max-pooling layers were used to decrease computational demand and to increase the robustness of the network against small image distortions.39 Max pooling takes the maximum value of a convolutional layer's output over a specified kernel and passes this response to the next layer. Batch normalization prevents overfitting and decreases training time by reducing internal covariate shift through normalization of the mean and variance statistics of the network units.40 In the second half of the U-Net architecture, upsampling is used in place of pooling to connect the coarse outputs from the pooled convolutional layers back to the pixel-level segmentation.41 The final fully connected layer uses a softmax activation function42 to provide probability maps for each class (capillary, large vessel, background, image canvas), which can then be converted to binary maps using a global threshold determined by Otsu's method19 to produce the final segmentation. For the purpose of computing capillary metrics, Otsu's method was applied only to the capillary class to separate capillary pixels from non-capillary pixels.
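The final classification step can be sketched as follows. The 64 × 64 × 4 logit array below is hypothetical stand-in data, and the histogram-based Otsu routine is a plain NumPy version for illustration; a library implementation such as `skimage.filters.threshold_otsu` would serve the same purpose.

```python
import numpy as np

def softmax(logits):
    """Convert per-pixel logits to class probabilities along the last axis."""
    e = np.exp(logits - logits.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def otsu_threshold(values, nbins=256):
    """Global threshold that maximizes between-class variance (Otsu's method)."""
    hist, edges = np.histogram(values, bins=nbins)
    hist = hist.astype(float)
    centers = (edges[:-1] + edges[1:]) / 2
    w0 = np.cumsum(hist)                          # pixels at or below each bin
    w1 = w0[-1] - w0                              # pixels above each bin
    cum = np.cumsum(hist * centers)
    m0 = cum / np.maximum(w0, 1e-12)              # mean of the lower class
    m1 = (cum[-1] - cum) / np.maximum(w1, 1e-12)  # mean of the upper class
    between = w0[:-1] * w1[:-1] * (m0[:-1] - m1[:-1]) ** 2
    return centers[np.argmax(between)]

# Hypothetical network output: one logit per pixel for each of the four
# classes (capillary, large vessel, background, image canvas)
logits = np.random.default_rng(0).normal(size=(64, 64, 4))
probs = softmax(logits)        # probability maps, summing to 1 at each pixel

cap_prob = probs[..., 0]       # capillary-class probability map
cap_mask = cap_prob > otsu_threshold(cap_prob.ravel())  # binary capillary map
```

Thresholding only `probs[..., 0]` mirrors the step described above: for capillary metrics, only the capillary class is binarized, with the other three classes ignored.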
Our CNN contains important alterations from the base U-Net structure. First, we expanded on previous CNN designs that classified pixels into one of only two categories (i.e., vessel and non-vessel) by developing a four-class pixel classification system (i.e., capillary, large vessel, background, and image canvas classes). Second, the size of the convolutional filter kernel was changed from a fixed 3 × 3 pixels to varying sizes of 25 × 25, 15 × 15, and 10 × 10 pixels to accommodate both large vessels (defined to be >20 pixels in diameter) and capillaries (which typically range from 7 to 14 pixels in diameter). In addition, a weighting function was implemented using TensorFlow,43 as the proportion of background and canvas pixels in the training set was much greater than the number of pixels classified as capillaries or large vessels. The weighting function weights capillary and large vessel pixels more heavily, in inverse proportion to their percent representation in training set patches. Using TensorFlow, a weighted categorical cross-entropy loss function was implemented for the network training and adjusted to incorporate multiple pixel classes.
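The weighting scheme can be sketched in plain NumPy (illustrative only; the actual implementation used TensorFlow, and the two-class label map and pixel counts below are hypothetical):

```python
import numpy as np

def inverse_frequency_weights(labels, n_classes):
    """Class weights inversely proportional to each class's representation."""
    counts = np.bincount(labels.ravel(), minlength=n_classes).astype(float)
    freq = counts / counts.sum()
    return 1.0 / np.maximum(freq, 1e-12)

def weighted_categorical_crossentropy(y_true, y_pred, weights):
    """Cross-entropy in which each pixel is scaled by its true class's weight.

    y_true: one-hot labels (..., n_classes); y_pred: softmax probabilities."""
    y_pred = np.clip(y_pred, 1e-7, 1.0 - 1e-7)
    return float(np.mean(-np.sum(weights * y_true * np.log(y_pred), axis=-1)))

# Toy label map: class 0 (capillary) is rare, class 1 (background) is common
labels = np.array([0] * 10 + [1] * 90)
w = inverse_frequency_weights(labels, 2)   # capillary weighted 9x background

# Both predictions give only 0.3 probability to the true class, but the
# error on the rare capillary pixel is penalized far more heavily
cap_loss = weighted_categorical_crossentropy(
    np.array([[1.0, 0.0]]), np.array([[0.3, 0.7]]), w)
bg_loss = weighted_categorical_crossentropy(
    np.array([[0.0, 1.0]]), np.array([[0.7, 0.3]]), w)
```

Without the weighting, a network trained on such data could score well by predicting "background" everywhere; the inverse-frequency weights make capillary and large-vessel errors expensive enough to prevent that.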