To accomplish this, a StyleGAN model was first trained as in Burlina et al. (ref. 14), using the same training dataset as the baseline DLS. Approximately 120,000 pairs [w, I] of latent space vector w and image I were then generated by running the trained StyleGAN model in inference mode. Thereafter, a new retinal appearance DLS (RA-DLS), working in image space, was trained to distinguish retinal images with markers of darker-skin individuals from those with markers of lighter-skin individuals, using the extrapolated RA labels described earlier. This RA-DLS differed from the E-RA-DLS in that it was trained on more images, comprising equal numbers of fundi from darker-skin and lighter-skin individuals. Note that StyleGAN includes two latent spaces, one with an input vector
Z of size 512, which is mapped via a fully connected network to a second latent space with style vector w, also of size 512; w is then replicated 16 times, once for each scale, for a final size of 16 × 512. The latter latent space representation w was used to manipulate factors of variation. Subsequently, a DLS for classification of retinal appearance, operating in latent space
w, called the L-RA-DLS, was created as follows: from the original 120,000 synthetic images, the subset classified as healthy by the baseline B-DR-DLS was selected. We then inferred RA labels for this subset using the RA-DLS, and those labels were used to train the L-RA-DLS. Finally, from the 120,000 synthetic images, a subset of 10,660 images classified as DR-referable by the B-DR-DLS was selected as starter images, which underwent the following latent space manipulation to generate new synthetic (RD) data. The latent space representations w corresponding to these images were subjected to a gradient descent method that increased the softmax value of the L-RA-DLS, thereby accentuating the desired image markers for individuals with darker skin. The gradient descent moved along a trajectory in latent space that maximally transformed the images to gain the desired markers while keeping the vasculature and the disease lesion markers unchanged. This is in contrast with a rectilinear or arbitrary trajectory, which would have produced simultaneous changes in all image markers (corresponding to ethnicity, DR status, vasculature, and other factors of variation).
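
The latent space manipulation described above can be illustrated with a minimal sketch. It is not the implementation used in this work: the trained L-RA-DLS is replaced by a small randomly initialised two-class softmax classifier, the latent dimension is reduced to 32 for speed, and the StyleGAN synthesis step that would turn the updated w back into an image is omitted. All function names, the step size, and the number of steps are hypothetical choices made for illustration. The sketch shows how w can be replicated into a 16 × 512 style tensor and how repeatedly following the gradient of the classifier's log-probability for the target class increases its softmax output.

```python
import math
import random

N_STYLES, W_DIM = 16, 512  # sizes given in the text: w (512) replicated 16 times


def tile_w(w):
    """Replicate a style vector once per scale, giving an N_STYLES x len(w) tensor."""
    return [list(w) for _ in range(N_STYLES)]


def make_toy_classifier(dim, hidden, seed=0):
    """Hypothetical stand-in for the L-RA-DLS: one tanh hidden layer, two logits."""
    rng = random.Random(seed)
    W1 = [[rng.gauss(0, 1 / math.sqrt(dim)) for _ in range(dim)] for _ in range(hidden)]
    b1 = [0.0] * hidden
    W2 = [[rng.gauss(0, 1 / math.sqrt(hidden)) for _ in range(hidden)] for _ in range(2)]
    b2 = [0.0, 0.0]
    return W1, b1, W2, b2


def forward(params, w):
    """Return hidden activations and the two-class softmax probabilities for w."""
    W1, b1, W2, b2 = params
    h = [math.tanh(sum(a * x for a, x in zip(row, w)) + b) for row, b in zip(W1, b1)]
    logits = [sum(a * x for a, x in zip(row, h)) + b for row, b in zip(W2, b2)]
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return h, [e / total for e in exps]


def grad_log_prob(params, w, target):
    """Gradient of log p(target | w) with respect to w, by manual backpropagation."""
    W1, _, W2, _ = params
    h, probs = forward(params, w)
    # d log p_target / d logit_k = 1[k == target] - p_k
    dlogits = [(1.0 if k == target else 0.0) - probs[k] for k in range(2)]
    dh = [sum(dlogits[k] * W2[k][j] for k in range(2)) for j in range(len(h))]
    dpre = [dh[j] * (1.0 - h[j] ** 2) for j in range(len(h))]  # tanh derivative
    grad = [sum(dpre[j] * W1[j][i] for j in range(len(h))) for i in range(len(w))]
    return grad, probs[target]


def accentuate(params, w0, target=1, step=0.1, n_steps=100):
    """Move w along the classifier gradient to raise the target-class probability."""
    w = list(w0)
    history = []
    for _ in range(n_steps):
        grad, p = grad_log_prob(params, w, target)
        history.append(p)
        # The gradient is recomputed at every step, so the path need not be straight.
        w = [wi + step * gi for wi, gi in zip(w, grad)]
    history.append(forward(params, w)[1][target])
    return w, history


if __name__ == "__main__":
    toy_dim = 32  # toy latent size; the text uses 512
    params = make_toy_classifier(dim=toy_dim, hidden=8, seed=0)
    rng = random.Random(1)
    w_start = [rng.gauss(0, 1) for _ in range(toy_dim)]
    w_end, probs = accentuate(params, w_start)
    print("target probability before:", probs[0])
    print("target probability after: ", probs[-1])
    print("style tensor shape:", len(tile_w([0.0] * W_DIM)), "x", W_DIM)
```

In a full pipeline under the same assumptions, the toy classifier would be replaced by the trained L-RA-DLS, and each updated w would be passed through the StyleGAN synthesis network to produce the modified image. Because the update direction is recomputed from the classifier at every step, the resulting path generally differs from a fixed straight-line direction in latent space.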