To the best of our knowledge, our study is the first application of GAN-based algorithms to IVCM images. GAN-based algorithms have been applied to the segmentation of medical images from various imaging modalities, such as computed tomography,
61 magnetic resonance,
62 X-ray,
63 and ultrasound imaging.
64 In ophthalmology, GANs have been used for the segmentation of retinal vessels in fundus images.
53,65 Traditional segmentation approaches, such as graph-cut methods, have relied on pixel-wise correspondence for decades, with significant caveats including artifacts and leakage. GANs have the potential to bring out the best of these approaches, with the discriminator acting as a shape regularizer.
44 Although the effect of this regularization is reported to be more prominent for compact shapes than for elongated structures such as vessels or neurites, the discriminator's perception size (its receptive field) can be set anywhere from the whole image down to a single pixel, as in ImageGAN, PatchGAN, and PixelGAN, respectively. To regularize the network and avoid collapse during training, Li and Shen
66 proposed a method for cell segmentation that combines CGAN with AC-GAN and introduces a classifier loss term into the architecture. Although that study, unlike ours, includes a pre-processing step, the classification loss was shown to improve the resulting segmentations. Unannotated images may also be fed into the segmentation workflow alongside annotated ones; the former aid training, leading to a more robust discrimination process and more accurate generated segmentation maps.
67 The main disadvantage of GAN-based algorithms is that separate networks must be trained together. During training, a GAN sets up a zero-sum game that seeks an equilibrium point between these networks, in contrast to conventional algorithms, where the objective is simply to minimize a loss function.
68 Consequently, several issues arise when training GANs, as also experienced in this study, including oscillations, mode collapse, and vanishing or exploding gradients.
69 To overcome or minimize these effects, tedious fine-tuning is required. On the other hand, GANs have two significant advantages over U-Net and similar architectures. First, GANs allow the generator network to produce near-realistic images. In the medical imaging literature, this is exploited in image synthesis to expand datasets that contain too few images, as is common in medical image acquisition.
70 Second, GANs are generally more robust to artifacts and digital image noise, because the discriminator forces the generator to produce better outputs (segmentation maps, in this case).
71
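The zero-sum character of GAN training described above can be illustrated with a minimal sketch. It assumes the original minimax value function, V(D, G) = E[log D(x)] + E[log(1 - D(G(z)))], which is not necessarily the loss used in this study; the function names and the discriminator outputs below are hypothetical and serve only as an illustration.

```python
import math

def value_function(d_real, d_fake):
    """Empirical minimax value: mean(log D(x)) + mean(log(1 - D(G(z)))).

    d_real: discriminator probabilities for real samples (assumed in [0, 1]).
    d_fake: discriminator probabilities for generated samples (assumed in [0, 1]).
    """
    eps = 1e-12  # guards against log(0)
    real_term = sum(math.log(p + eps) for p in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - p + eps) for p in d_fake) / len(d_fake)
    return real_term + fake_term

def minimax_losses(d_real, d_fake):
    """Return (discriminator_loss, generator_loss) under the minimax form.

    The discriminator maximizes V, so its loss is -V; the generator
    minimizes V, so its loss is +V. The two losses always sum to zero.
    """
    v = value_function(d_real, d_fake)
    return -v, v

# Hypothetical discriminator outputs, for illustration only.
confident_real = [0.9, 0.8, 0.95]   # real samples scored as likely real
confident_fake = [0.1, 0.3, 0.2]    # generated samples scored as likely fake
loss_d, loss_g = minimax_losses(confident_real, confident_fake)

# At the idealized equilibrium the discriminator outputs 0.5 everywhere.
eq_loss_d, eq_loss_g = minimax_losses([0.5, 0.5], [0.5, 0.5])
```

In this form any gain for the discriminator is an equal loss for the generator, so neither loss can be driven down independently as in conventional single-network training. In practice a non-saturating generator loss is often substituted, which relaxes the exact zero-sum property.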