After data collection, the raw data were reformatted so that each sample in the dataset consisted of a set of predictors and a target value that could be used by the machine learning model. Among the biometry records, individual eyes could have multiple preoperative and postoperative sets of biometry measurements. To take advantage of these records, preoperative and postoperative biometry records of the same eye were matched in all possible combinations: an eye with \(x\) preoperative records and \(y\) postoperative records yielded \(xy\) combinations. The inclusion of all possible preoperative and postoperative biometry record combinations represents a form of data augmentation, intended to increase robustness to measurement variations while recognizing that the same eye can have varying lens thickness and preoperative anterior chamber depth due to natural cataract progression. At the end of data preprocessing (Fig. 1, middle panel), a dataset of 4137 samples involving 847 distinct patients was generated and used for the development of the machine learning model. Each sample consisted of (1) preoperative biometry: AL, central corneal thickness (CCT), ACD, crystalline lens thickness (LT), flat keratometry K1, steep keratometry K2, \(Km = \frac{K1 + K2}{2}\), and horizontal white-to-white (WTW), (2) patient sex, (3) IOL power, and (4) postoperative ACD, where (1) to (3) were the predictors and (4) was the target variable in the machine learning model.
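As an illustration of this matching step, the sketch below pairs records with a pandas merge, assuming hypothetical DataFrames pre_df and post_df that share an eye_id column (the column names are assumptions, not the study's actual schema):

```python
import pandas as pd

def pair_biometry_records(pre_df: pd.DataFrame, post_df: pd.DataFrame) -> pd.DataFrame:
    """Pair every preoperative record of an eye with every postoperative
    record of the same eye, yielding x * y samples per eye."""
    # An inner merge on the eye identifier forms the Cartesian product
    # of the x preoperative and y postoperative records within each eye.
    return pre_df.merge(post_df, on="eye_id", suffixes=("_pre", "_post"))
```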
Corneal power is one of the most important features in both postoperative ACD prediction and postoperative refraction prediction in cataract surgery. However, corneal power measurement is unreliable in patients with prior corneal refractive surgery. To evaluate the applicability of our method to such patients, we examined how well it performed when corneal power was not available.
We also studied the effect of IOL power on postoperative ACD prediction: although IOL power is directly associated with IOL thickness, which could in turn affect postoperative ACD, IOL power has, to our knowledge, not been considered in existing formulas.
In summary, we examined the performance of three classes of models where different subsets of variables were used as predictors: (1) Base, which used AL, CCT, ACD, LT, K1, K2, Km, WTW, and patient sex as predictors, (2) Base + IOL, which added IOL power to “Base” as an additional feature, and (3) Base − K, which removed K1, K2, and Km from “Base”, using AL, CCT, ACD, LT, WTW, and patient sex as predictors.
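For clarity, these predictor subsets can be written as explicit feature lists; the sketch below is illustrative, and the column names are assumptions rather than the study's actual variable names:

```python
# Predictors shared by all three model classes.
BASE = ["AL", "CCT", "ACD", "LT", "K1", "K2", "Km", "WTW", "sex"]

FEATURE_SETS = {
    "Base": BASE,
    "Base + IOL": BASE + ["IOL_power"],
    # "Base - K" drops all keratometry-derived features (K1, K2, Km).
    "Base - K": [f for f in BASE if f not in ("K1", "K2", "Km")],
}
```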
LightGBM (2.2.3), which is a widely used framework for implementing the gradient boosted decision tree algorithm, was used to construct the machine learning model. During the training process, the training data were augmented through two methods (Fig. 1, right panel): (1) IOL power augmentation and (2) data interpolation. The purpose of IOL power augmentation was to improve prediction performance by incorporating the relationship between IOL power and IOL thickness into the training data. During IOL power augmentation, the implanted IOL power (\(IOL_{old}\)) was replaced by \(n_{IOL}\) randomly selected IOL powers, and the ground truth postoperative ACD was adjusted based on the selected IOL powers (see Supplementary Fig. S1). Specifically, for each distinct patient, \(n_{IOL}\) synthetic IOL powers (\(IOL_{new,1}\), \(IOL_{new,2}\), …) between [\(IOL_{min}\), \(IOL_{max}\)] were selected, and the adjusted (new) postoperative ACD corresponding to each new IOL power was calculated as
\begin{eqnarray*}ACD_{new} = ACD_{old} - m\left( IOL_{new} - IOL_{old} \right),\end{eqnarray*}
where \(m \in [0, 1]\) is a constant, \(IOL_{min} \geq 6\), and \(IOL_{max} \leq 30\). The values of \(IOL_{min}\), \(IOL_{max}\), \(m\), and \(n_{IOL}\) were optimized through cross-validation.
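A minimal sketch of this augmentation step, assuming each sample is a dict with hypothetical IOL_power and postop_ACD keys and that synthetic powers are drawn uniformly (the sampling scheme, key names, and default bounds are assumptions; m, n_iol, iol_min, and iol_max stand in for the cross-validated values):

```python
import numpy as np

rng = np.random.default_rng(0)

def augment_iol_power(sample, n_iol, m, iol_min=6.0, iol_max=30.0):
    """Replace the implanted IOL power with n_iol random powers and
    shift the ground-truth postoperative ACD accordingly."""
    augmented = []
    for iol_new in rng.uniform(iol_min, iol_max, size=n_iol):
        new = dict(sample)
        new["IOL_power"] = iol_new
        # ACD_new = ACD_old - m * (IOL_new - IOL_old): a higher-power
        # IOL yields a smaller adjusted postoperative ACD.
        new["postop_ACD"] = sample["postop_ACD"] - m * (iol_new - sample["IOL_power"])
        augmented.append(new)
    return augmented
```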
In data interpolation, \(k\) samples were randomly picked, and the center of those \(k\) samples was calculated by averaging each dimension of the predictor vector \(X\) and the target value \(y\); categorical variables were treated as continuous variables for this averaging. The number of samples, \(k\), used to create each synthetic sample and the number of synthetic samples generated, \(n\), were optimized through cross-validation.
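A sketch of the interpolation step under the same caveats, assuming the training samples form a NumPy feature matrix X (categorical columns already numerically encoded, per the text) with target vector y:

```python
import numpy as np

rng = np.random.default_rng(0)

def interpolate_samples(X, y, k, n):
    """Create n synthetic samples, each the centroid of k randomly
    chosen real samples; targets are averaged the same way."""
    X_syn = np.empty((n, X.shape[1]))
    y_syn = np.empty(n)
    for i in range(n):
        idx = rng.choice(len(X), size=k, replace=False)
        # Average each predictor dimension and the target over the k
        # picked samples (categorical columns averaged as numbers).
        X_syn[i] = X[idx].mean(axis=0)
        y_syn[i] = y[idx].mean()
    return X_syn, y_syn
```

The synthetic samples produced by both augmentation steps would then be appended to the original training data before fitting the LightGBM model.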