Abstract
Purpose:
The incidence of orbital blowout fractures (OBFs) is gradually increasing due to traffic accidents, sports injuries, and ocular trauma. Orbital computed tomography (CT) is crucial for accurate clinical diagnosis. In this study, we built an artificial intelligence (AI) system based on two available deep learning networks (DenseNet-169 and UNet) for fracture identification, fracture side distinguishment, and fracture area segmentation.
Methods:
We established a database of orbital CT images and manually annotated the fracture areas. DenseNet-169 was trained and evaluated on the identification of CT images with OBFs. We also trained and evaluated DenseNet-169 and UNet for fracture side distinguishment and fracture area segmentation. We used cross-validation to evaluate the performance of the AI algorithm after training.
Results:
For fracture identification, DenseNet-169 achieved an area under the receiver operating characteristic curve (AUC) of 0.9920 ± 0.0021, with an accuracy, sensitivity, and specificity of 0.9693 ± 0.0028, 0.9717 ± 0.0143, and 0.9596 ± 0.0330, respectively. DenseNet-169 realized the distinguishment of the fracture side with accuracy, sensitivity, specificity, and AUC of 0.9859 ± 0.0059, 0.9743 ± 0.0101, 0.9980 ± 0.0041, and 0.9923 ± 0.0008, respectively. The intersection over union (IoU) and Dice coefficient of UNet for fracture area segmentation were 0.8180 ± 0.0093 and 0.8849 ± 0.0090, respectively, showing a high agreement with manual segmentation.
Conclusions:
The trained AI system can automatically identify and segment OBFs and might serve as a new tool for smart diagnosis and for improving the efficiency of three-dimensional (3D) printing-assisted surgical repair of OBFs.
Translational Relevance:
Our AI system, based on two available deep learning network models, could help in precise diagnoses and accurate surgical repairs.
A total of 3016 orbital CT images (1997 fracture and 1019 non-fracture CT images) were obtained from the Second Norman Bethune Hospital of Jilin University. All patients were Asian, and the baseline demographic characteristics are shown in
Table 1. The fracture CT images were from patients with monocular OBFs. For the non-fracture group (162 patients, 1019 images), we selected several consecutive CT scans with complete bony walls for every patient. For the fracture group (335 patients, 1997 images), we selected continuous scans that showed the fracture areas for each patient. The fracture and non-fracture images together constituted dataset 1, which was used for training and evaluating fracture identification. The fracture images alone constituted dataset 2, used for training and evaluating fracture side distinguishment and fracture area segmentation. The fracture CT images in the database were independently judged by three experienced radiologists; the diagnosis was established and the fracture areas were annotated when a consensus was reached. A senior physician was invited to determine and annotate the fracture areas in the event of disagreement among the three radiologists. The direct and indirect signs of OBFs in the orbital CT images were annotated with the online tool LabelMe. Direct signs of OBFs comprised an interruption in the continuity of the orbital wall and a change in the contour of the orbit. Indirect signs included effusion in the adjacent sinus cavities, thickening and swelling of the extraocular muscles, and entrapment of the orbital contents. The fracture-type and side-specific distribution of CT images in the fracture group is shown in
Table 2. To minimize the computational cost, the target region of each orbital CT image was extracted automatically. We used OpenCV to automatically identify the region of interest (ROI) through a template-matching program, which marked a rectangular orbital region and extracted the ROI. The procedure first calculated the mean value of the target area after manual cropping; this mean value was then used as a template for automatic matching and cropping. Finally, the input images were resized to 224 × 224 pixels for training DenseNet-169 and 128 × 256 pixels for training UNet, and the pixel values were normalized to the range 0 to 1.
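The template-matching step described above can be sketched as follows. This is a minimal NumPy stand-in for OpenCV's template matching (the study used OpenCV itself); the function names `match_template` and `extract_roi` and the sum-of-squared-differences score are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def match_template(image, template):
    """Locate the template in the image by minimizing the sum of squared
    differences -- a simplified stand-in for cv2.matchTemplate."""
    ih, iw = image.shape
    th, tw = template.shape
    best, best_pos = np.inf, (0, 0)
    for y in range(ih - th + 1):
        for x in range(iw - tw + 1):
            patch = image[y:y + th, x:x + tw]
            score = np.sum((patch - template) ** 2)
            if score < best:
                best, best_pos = score, (y, x)
    return best_pos

def extract_roi(image, template):
    """Crop the matched rectangular region and normalize pixel values
    to [0, 1], as done before feeding images to the networks."""
    y, x = match_template(image, template)
    th, tw = template.shape
    roi = image[y:y + th, x:x + tw].astype(np.float64)
    return (roi - roi.min()) / (roi.max() - roi.min() + 1e-8)
```

In practice the template would be the mean image of manually cropped orbital regions, and the crop would be resized to the network-specific input shape (224 × 224 or 128 × 256).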
Table 1. Baseline Demographic Characteristics of Each Group
Table 2. Fracture-Type and Side-Specific Distribution of CT Images in the Fracture Group
In this study, we used random rotation (0 to 359 degrees) as the data augmentation method for the training set in dataset 1. At each transformation, every image in dataset 1 was rotated by a randomly selected angle between 0 and 359 degrees. The training set was trained for 100 epochs, which means each image underwent a random rotation 100 times. The transformed images were used only in the current step and were not stored.
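The on-the-fly rotation augmentation can be illustrated with a minimal nearest-neighbor implementation. This is a sketch only; the actual framework routine used in the study is not specified, and the function name `random_rotate` is an assumption.

```python
import numpy as np

def random_rotate(image, rng):
    """Rotate a 2-D image about its center by a random integer angle in
    [0, 359] degrees, using nearest-neighbor inverse mapping; pixels that
    fall outside the original frame are zero-filled.  A fresh rotation is
    drawn each call, so each epoch sees a different transform."""
    angle = np.deg2rad(rng.integers(0, 360))
    h, w = image.shape
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    ys, xs = np.mgrid[0:h, 0:w]
    cos_a, sin_a = np.cos(angle), np.sin(angle)
    # inverse mapping: for each output pixel, find its source coordinate
    src_y = cos_a * (ys - cy) + sin_a * (xs - cx) + cy
    src_x = -sin_a * (ys - cy) + cos_a * (xs - cx) + cx
    sy = np.rint(src_y).astype(int)
    sx = np.rint(src_x).astype(int)
    out = np.zeros_like(image)
    valid = (sy >= 0) & (sy < h) & (sx >= 0) & (sx < w)
    out[valid] = image[sy[valid], sx[valid]]
    return out
```

Because the rotated image is generated inside the training loop and discarded afterward, no augmented copies need to be stored, matching the procedure described above.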
To avoid data leakage, we divided the data by patient label to ensure that no patient appeared in both the training and test sets. Additionally, k-fold cross-validation was used to evaluate the trained AI algorithm, whereby the data were randomly divided into k = 5 folds. In each cross-validation round, k − 1 folds were used for training and the remaining fold for validation; the process was repeated k times so that each fold served once as the validation set. Compared with a single train–test split, cross-validation effectively reduces bias in the evaluation process.
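A patient-level split of this kind can be sketched with the standard library alone (scikit-learn's `GroupKFold` offers the same behavior); the function name `patient_level_kfold` and the round-robin assignment are illustrative assumptions.

```python
import random
from collections import defaultdict

def patient_level_kfold(image_ids, patient_of, k=5, seed=42):
    """Split images into k folds such that every image of a given patient
    lands in the same fold, so no patient appears in both the training
    and validation/test sets (avoiding data leakage)."""
    by_patient = defaultdict(list)
    for img in image_ids:
        by_patient[patient_of[img]].append(img)
    patients = sorted(by_patient)
    random.Random(seed).shuffle(patients)
    folds = [[] for _ in range(k)]
    # round-robin over shuffled patients keeps fold sizes balanced
    for i, p in enumerate(patients):
        folds[i % k].extend(by_patient[p])
    return folds
```

Each of the k folds then serves once as the validation set while the other k − 1 folds are used for training.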
We implemented the automatic identification of fracture images and the distinguishment of fracture sides using DenseNet-169 with the pre-trained ImageNet weight, and we used UNet for fracture area segmentation.
Through DenseNet, the feature layers within each dense block were densely reconnected to fully combine shallow and deep features.14 Because of neural networks’ strong fitting abilities, a small training set can easily cause overfitting. UNet consists of a contracting (down-sampling) path and an expanding (up-sampling) path. In the contracting path, a 3 × 3 valid convolution followed by rectified linear unit (ReLU) activation was applied twice in each block, and a 2 × 2 max-pooling operation reduced the image resolution while preserving the key information.15 After each down-sampling step, the number of feature layers increased and the spatial size was compressed. The expanding path of UNet gradually recovered the image details, precisely located the lesion site, and restored the feature map to the size of the input image. The expanding path also contained four blocks, each comprising 3 × 3 deconvolutions and the ReLU function; after each up-sampling operation, the feature map size was doubled and the number of channels was halved. We divided the dataset into training, validation, and test sets at an 8:1:1 ratio. Candidate parameters were first narrowed down by dynamically observing performance on the validation set, and the final hyperparameters were determined with the grid search method. The learning rate started at 0.1 and was gradually reduced once the accuracy on the validation set stabilized. Because of the size of the images in the database, we used a small (3 × 3) filter. The number of epochs was likewise chosen by dynamic observation of the training results: it was fixed once the accuracy remained stable as the number of epochs increased.
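The grid search over candidate hyperparameters can be sketched generically. This is a minimal illustration, not the authors' code: `evaluate` stands in for a full train-and-validate run returning validation accuracy, and the parameter names are hypothetical.

```python
import itertools

def grid_search(param_grid, evaluate):
    """Exhaustively try every combination of candidate hyperparameters
    and keep the one with the highest validation accuracy.  `evaluate`
    is assumed to train the model with the given parameters and return
    its accuracy on the validation set."""
    names = sorted(param_grid)
    best_score, best_params = -1.0, None
    for values in itertools.product(*(param_grid[n] for n in names)):
        params = dict(zip(names, values))
        score = evaluate(params)
        if score > best_score:
            best_score, best_params = score, params
    return best_params, best_score
```

In the study's setting, the candidate values themselves (e.g. for the learning rate) were first narrowed down by watching validation accuracy during training, which keeps the grid small enough to search exhaustively.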
Figure 1 shows the architectures of DenseNet-169 and UNet.
Table 3 shows the parameters of the algorithms.
Table 3. DenseNet-169 and UNet Algorithm Parameters
The training process was performed on a Xeon E5-2630 v3 @ 2.40 GHz server. The original UNet network model could not achieve effective segmentation of the fracture areas due to their irregular morphology and high variability. To address this challenge, we created a new loss function by adding a constraint, referring to the strategy of Srivastava,16 to optimize the network:
\begin{equation}
L = \alpha\,\log\left(-\sum_{i = 1}^{N}\left[ y_i \log(\hat{y}_i) + (1 - y_i)\log(1 - \hat{y}_i) \right]\right) + \beta\,\frac{1}{N}\sum_{i = 1}^{N}\left| y_i - \hat{y}_i \right|
\end{equation}
In this loss function for UNet, \(y_i\) is the true label extracted from the manually labeled segmentation map, \(\hat{y}_i\) is the predicted label of the segmentation map generated by the model, \(\alpha\) and \(\beta\) are model hyperparameters, set to 1 and 50, respectively, during the experiments, and \(N\) is the total number of pixels in the segmented image.
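The loss can be written directly from this definition. The sketch below is a NumPy transcription of the equation (the study would compute it inside the training framework); the clipping constant `eps`, added to keep the logarithms finite, is an implementation assumption.

```python
import numpy as np

def custom_loss(y_true, y_pred, alpha=1.0, beta=50.0, eps=1e-7):
    """Loss from the equation above: alpha times the log of the negated
    pixel-wise binary cross-entropy sum, plus beta times the mean
    absolute error.  alpha=1 and beta=50 follow the values reported
    for the experiments."""
    y_pred = np.clip(y_pred, eps, 1 - eps)  # keep log() finite
    bce_sum = np.sum(
        y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred)
    )
    mae = np.mean(np.abs(y_true - y_pred))
    return alpha * np.log(-bce_sum) + beta * mae
```

With β = 50, the mean-absolute-error constraint dominates the cross-entropy term, which pushes the predicted mask toward the manual annotation pixel by pixel.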
Supported by the National Natural Science Foundation of China (No. 82171053 and 81570864) and the Natural Science Foundation of Jilin Province (No. 20200801043GH and 20190201083JC). The funders had no role in the study design, data collection, analysis, decision to publish, or manuscript preparation.
Authors’ contributions: X.B. conceived and designed the experiments. X.B., B.F., and G.-Y.L. prepared the manuscript. X.B., X.Z., and Q.Z. performed the experiments. X.B. and X.Z. analyzed the data. L.W. optimized the algorithm and revised the manuscript.
Disclosure: X. Bao, None; X. Zhan, None; L. Wang, None; Q. Zhu, None; B. Fan, None; G.-Y. Li, None