Following the methodology of Scholler et al.,11 the first step was to use the dynamic intensity variation per pixel (Fig. 2b) to compute the power spectral density (PSD) of each pixel using Welch's method. Each PSD was then L1-normalized so that it could be treated as a probability distribution over frequency. The hue channel was then computed as the mean frequency:
\begin{eqnarray}
{\rm{H}} = \langle f \rangle = {PSD}_{norm} \cdot f
\end{eqnarray}
where PSDnorm is the normalized PSD array, f is the frequency array, and “·” is the dot product (Fig. 2c). The H values were inverted so that red indicates rapid movement and blue indicates slow movement, which is more intuitive to read. The H values were also rescaled to fall between 0 and 0.66. This rescaling is needed because hue in the HSV color space is a wheel on which the values 0 and 1 (i.e., 0° and 360°) correspond to the same color, red; restricting the range to 0–0.66 (red to blue) prevents the fastest and slowest movements from being mapped to the same color in the final HSV image. Saturation was computed as the inverse of the normalized PSD bandwidth, so the saturation channel carries the frequency bandwidth information. In practice, this bandwidth is computed as the standard deviation of the frequencies (i.e., it corresponds to the width of the frequency histogram) as:
\begin{eqnarray}
S = \sqrt{{PSD}_{norm} \cdot f^2 - \left( {PSD}_{norm} \cdot f \right)^2}
\end{eqnarray}
where PSDnorm is the normalized PSD array, f is the frequency array, and “·” is the dot product (Fig. 2d): the wider the spectrum, the lower the saturation. White noise has a broad bandwidth and therefore appears grayish rather than colored. To obtain a single value of H and S per pixel, the area under the resulting curve (Figs. 2c and 2d, respectively) was calculated and rescaled. Finally, the value (which corresponds to the perceived pixel intensity) was computed as the average of the running standard deviation of the mean pixel intensity over time, with a window size set at 10% of the samples, in accordance with Scholler et al.11 (Fig. 2e). After computing the three channels, the dynamic image was transformed from HSV into the RGB color space for display purposes.
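The three-channel computation described above can be illustrated with a short Python sketch. This is not the authors' implementation, only a minimal reading of the text under several assumptions: the function names (`rescale`, `dynamic_hsv`, `hsv_to_rgb_image`) are hypothetical, the Welch segment length is an arbitrary choice, rescaling is done by min-max over the image, the "inverse" of the bandwidth is taken by negating it before rescaling, the per-pixel H and S values are taken directly from the dot products, and the running standard deviation uses overlapping sliding windows.

```python
import colorsys

import numpy as np
from scipy.signal import welch


def rescale(x, lo=0.0, hi=1.0):
    """Min-max rescale x onto [lo, hi]; a constant array maps to lo."""
    span = x.max() - x.min()
    if span == 0:
        return np.full_like(x, lo, dtype=float)
    return lo + (hi - lo) * (x - x.min()) / span


def dynamic_hsv(stack, fs, window_frac=0.10):
    """Return (hue, sat, val) maps from a (T, Y, X) intensity stack.

    fs is the frame rate in Hz. All rescaling choices are assumptions.
    """
    stack = np.asarray(stack, dtype=float)
    n_t = stack.shape[0]

    # Welch PSD along the time axis: one spectrum per pixel, shape (F, Y, X).
    # The segment length is an assumed value, not taken from the paper.
    f, psd = welch(stack, fs=fs, nperseg=min(256, n_t), axis=0)

    # L1 normalization so each spectrum sums to 1, like a probability distribution.
    total = psd.sum(axis=0, keepdims=True)
    psd_norm = psd / np.where(total > 0, total, 1.0)

    # Mean frequency <f> = PSD_norm . f, computed for every pixel.
    mean_f = np.tensordot(f, psd_norm, axes=(0, 0))

    # Bandwidth = sqrt(PSD_norm . f^2 - (PSD_norm . f)^2), i.e. the standard deviation.
    var_f = np.tensordot(f**2, psd_norm, axes=(0, 0)) - mean_f**2
    std_f = np.sqrt(np.clip(var_f, 0.0, None))

    # Hue: inverted and limited to [0, 0.66] so fast -> red (0), slow -> blue (0.66).
    hue = rescale(-mean_f, 0.0, 0.66)
    # Saturation: inverted bandwidth, so broad (noise-like) spectra look gray.
    sat = rescale(-std_f, 0.0, 1.0)

    # Value: time-average of the running standard deviation, window = 10% of samples.
    # Note: the reduction below materializes the windows, so memory grows with win.
    win = max(2, int(round(window_frac * n_t)))
    windows = np.lib.stride_tricks.sliding_window_view(stack, win, axis=0)
    val = rescale(windows.std(axis=-1).mean(axis=0), 0.0, 1.0)

    return hue, sat, val


def hsv_to_rgb_image(hue, sat, val):
    """Convert per-pixel HSV maps (all in [0, 1]) to a (Y, X, 3) RGB image."""
    to_rgb = np.vectorize(colorsys.hsv_to_rgb)
    return np.stack(to_rgb(hue, sat, val), axis=-1)
```

For a stack of shape (T, Y, X) acquired at `fs` frames per second, `hsv_to_rgb_image(*dynamic_hsv(stack, fs))` returns an RGB array with values in [0, 1]. The element-wise `colorsys` conversion is chosen here only to keep the sketch dependency-light; a vectorized converter would be preferable for large images.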