A novel computer algorithm was developed to improve the efficiency, objectivity, and accuracy of the haze-grading process. Like the acutance calculation, the algorithm required optimization for the grading task because the clarity score it produces depends on the spatial frequency band over which image power spectra are integrated (see Methods).
Figure 5A plots the clarity score computed for the nine reference images of the Miami scale as the upper limit of the frequency band was varied. Lowering the high-frequency limit from 256 cycles/image (all frequencies) to 30 to 50 cycles/image had little impact on clarity scores, which increased systematically with image clarity. Further decreases in the upper limit reduced scores for all the reference images, with the clearest ones (levels 0–3) most affected. As a result, clarity scores became largely independent of image clarity.
Figure 5B plots the clarity score as the lower limit of the frequency band was varied. Scores decreased as the low-frequency limit was raised from one cycle/image (all frequencies) to approximately 10 cycles/image, but all reference images were similarly impacted. Further increases in the lower limit affected the clearest images (levels 0–3) preferentially, and clarity scores again became independent of image clarity. Taken together, the results indicate that vitreous haze is determined by information that mostly resides within the spatial frequency range of 10 to 50 cycles/image. The lower and upper limits of the integration band were therefore frozen at these values, and the algorithm was applied without further modification to the 120 test images.
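The band-limited integration described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `band_power_score`, the mean subtraction, the radial frequency definition, and the normalization by the squared pixel count are all assumptions made for illustration, and the actual procedure is the one given in Methods. Only the 10 and 50 cycles/image limits come from the text.

```python
import numpy as np

def band_power_score(image, f_lo=10.0, f_hi=50.0):
    """Integrate image power between f_lo and f_hi cycles/image.

    Hypothetical sketch: the normalization and preprocessing are
    assumptions, not the procedure described in the paper's Methods.
    """
    img = np.asarray(image, dtype=float)
    img = img - img.mean()                      # remove the DC component
    n_rows, n_cols = img.shape

    # Two-dimensional power spectrum of the image
    power = np.abs(np.fft.fft2(img)) ** 2

    # Frequency of each FFT bin, expressed in cycles/image
    fy = np.fft.fftfreq(n_rows) * n_rows
    fx = np.fft.fftfreq(n_cols) * n_cols
    radius = np.hypot(fy[:, None], fx[None, :])

    # Keep only the bins inside the integration band
    band = (radius >= f_lo) & (radius <= f_hi)

    # Assumed normalization: divide by (pixel count)^2 so that, by
    # Parseval's theorem, the result is the in-band share of the
    # image's mean squared contrast
    return float(power[band].sum() / (n_rows * n_cols) ** 2)
```

Under these assumptions, an image whose contrast lies mainly at frequencies outside the band scores near zero, which mirrors why narrowing the band too far removes the score's dependence on image clarity.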
Figure 5C compares computed haze scores with reader grades of vitreous haze using the Miami scale. Exact agreement was substantial (G1 and G2: κ = 0.68 and 0.64, AC = 0.93 and 0.91), and within-one (κ = 0.82 and 0.81) and within-two (κ = 0.84 and 0.83) levels of agreement were almost perfect for both readers. The results are comparable to those between readers (Fig. 3), with the remaining difference in κ-value attributable to three main outliers (greater than two levels of disagreement) that were the same for both readers.
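For readers unfamiliar with tolerance-based agreement, the sketch below shows one common way to compute a κ statistic for exact, within-one, and within-two agreement: grade pairs that differ by at most `tol` levels are counted as agreeing, both in the observed and in the chance-expected proportions. This construction is an assumption; the section does not state how the reported κ values were calculated, and the AC statistic is not covered here. The function name `tolerance_kappa` and its parameters are hypothetical.

```python
import numpy as np

def tolerance_kappa(grades_a, grades_b, n_levels, tol=0):
    """Cohen-style kappa counting grades within `tol` levels as agreement.

    Hypothetical sketch: one common construction, not necessarily the
    one used for the values reported in the paper.
    Grades are integers in the range 0 .. n_levels - 1.
    """
    a = np.asarray(grades_a, dtype=int)
    b = np.asarray(grades_b, dtype=int)

    # Joint distribution of the two sets of grades
    joint = np.zeros((n_levels, n_levels))
    np.add.at(joint, (a, b), 1.0)
    joint /= joint.sum()

    # Agreement weights: 1 when grades differ by at most tol, else 0
    levels = np.arange(n_levels)
    weights = (np.abs(levels[:, None] - levels[None, :]) <= tol).astype(float)

    # Observed agreement and agreement expected from the marginals alone
    p_obs = (weights * joint).sum()
    p_exp = (weights * np.outer(joint.sum(axis=1), joint.sum(axis=0))).sum()

    if np.isclose(p_exp, 1.0):
        return float("nan")                     # kappa is undefined here
    return float((p_obs - p_exp) / (1.0 - p_exp))
```

Setting `tol=0` reduces this to the ordinary unweighted Cohen's κ for exact agreement, while `tol=1` and `tol=2` correspond to within-one and within-two agreement under the stated assumption.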
Figure 6A shows the outlying fundus images. The algorithm scored the leftmost image as less hazy and the rightmost two as more hazy than the readers did.
Figure 6B shows three other fundus images of similar quality, based on reader grades, that the algorithm scored within two levels of the reader grades. From inspection of these and other test images, the outliers can be explained by the algorithm scoring the entire frame, whereas the readers adjusted their grades based on image artifacts or islands of vessel clarity within the photographs. Agreement using the NIH scale was likewise substantial to almost perfect for one reader (exact: κ = 0.64, AC = 0.87; within-one: κ = 0.81; within-two: κ = 0.82) and moderate to substantial for the second reader (exact: κ = 0.58, AC = 0.77; within-one: κ = 0.75; within-two: κ = 0.78).