The architecture of our proposed machine learning model is shown in Figure 1. The study focuses on XGBoost (a recently developed meta-algorithm) because of its reliable and superior performance compared with other classic machine learning methods.17 XGBoost implements extreme gradient boosting, a scalable form of parallel tree boosting that optimizes both the training loss and a regularization term over the ensemble of generated trees. In this study, we used XGBoost to predict the surgery option class \(y_i\) from the input feature vector \(X_i = \{ x_1, x_2, \cdots , x_N \}\), which comprises preoperative measurements and questionnaire responses. The training procedure was conducted via an additive strategy. Given a sample \(i\) with feature vector \(X_i\), a tree ensemble model uses \(K\) additive functions to predict the output value \({\hat{y}_i}\) as follows:
\begin{eqnarray*}{\hat{y}_i} = \mathop \sum \limits_{k = 1}^K {f_k}\left( {{X_i}} \right),\ {f_k} \in F\end{eqnarray*}
where \(f_k(X_i)\) denotes an independent tree structure that maps \(X_i\) to a leaf score, and \(F\) denotes the space of functions containing all regression trees. The XGBoost algorithm introduces a regularization term to control overfitting. The objective function of XGBoost is expressed as follows:
\begin{eqnarray*}\min \left\{ {Obj} \right\} = \ {\rm{min}}\left\{ {\mathop \sum \limits_i l\left( {{y_i},{{\hat{y}}_i}} \right) + \mathop \sum \limits_t \Omega \left( {{f_t}} \right)} \right\}\end{eqnarray*}
\begin{eqnarray*}\Omega \left( {{f_t}} \right) = \gamma T + \ \frac{1}{2}\lambda \mathop \sum \limits_{j = 1}^T {\omega _j}^2\end{eqnarray*}
where \(l( {{y_i},{{\hat{y}}_i}} )\) denotes a function measuring the difference between the ground-truth value and the predicted value of the \(i\)th observation in the current iteration. The \(\Omega(f_t)\) term penalizes the complexity of the model, where \(T\) denotes the number of leaves and \(\omega_j\) denotes the output weight of leaf \(j\) in each sub-decision-tree model. The constants \(\gamma\) and \(\lambda\) control the degree of regularization. In the iterative learning process of XGBoost, the loss function is expanded as a second-order Taylor series to optimize the objective efficiently, and \(L_1\) and \(L_2\) regularization terms are introduced, acting analogously to the penalty terms of LASSO and ridge regression.18
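For clarity, the second-order approximation of the objective at iteration \(t\) takes the form below; the gradient \(g_i\) and Hessian \(h_i\) of the loss with respect to the previous prediction \(\hat{y}_i^{( t - 1 )}\) are standard notation from the XGBoost literature rather than symbols defined in this text, and the constant term is dropped:
\begin{eqnarray*}Ob{j^{\left( t \right)}} \approx \mathop \sum \limits_i \left[ {{g_i}{f_t}\left( {{X_i}} \right) + \frac{1}{2}{h_i}f_t^2\left( {{X_i}} \right)} \right] + \Omega \left( {{f_t}} \right)\end{eqnarray*}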
Hence, the XGBoost model offers more accurate predictions and efficiently prevents overfitting.19
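To make the correspondence between the formulation above and practice concrete, the following minimal sketch trains an XGBoost classifier with the xgboost Python package. The synthetic data, feature dimensions, and hyperparameter values are illustrative assumptions rather than the configuration used in this study; the \(\gamma\), \(\lambda\), and \(L_1\) terms of the regularized objective map onto the gamma, reg_lambda, and reg_alpha parameters shown.

```python
# Minimal sketch (not the authors' actual pipeline) of fitting an XGBoost
# classifier whose hyperparameters correspond to the regularized objective above.
import numpy as np
from xgboost import XGBClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Hypothetical stand-ins for the preoperative measurement/questionnaire features X_i
# and the surgery option class labels y_i (random noise, only to make the script run).
X = rng.normal(size=(500, 20))
y = rng.integers(0, 2, size=500)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

model = XGBClassifier(
    n_estimators=100,    # K: number of additive tree functions f_k
    max_depth=3,         # limits the complexity of each tree
    learning_rate=0.1,   # shrinkage applied at each additive step
    gamma=1.0,           # gamma in Omega(f_t): penalty per leaf (scales with T)
    reg_lambda=1.0,      # lambda in Omega(f_t): L2 penalty on leaf weights (ridge-like)
    reg_alpha=0.0,       # L1 penalty on leaf weights (LASSO-like)
    objective="binary:logistic",
    eval_metric="logloss",
)
model.fit(X_train, y_train)

# Predicted class labels are obtained from the sum of the K additive tree outputs.
print("Test accuracy:", model.score(X_test, y_test))
```

In a real application, these regularization and tree-complexity parameters would typically be tuned (e.g., by cross-validation) rather than fixed at the illustrative values used here.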