Open Access
Artificial Intelligence  |   April 2023
A Hybrid System for Automatic Identification of Corneal Layers on In Vivo Confocal Microscopy Images
Author Affiliations & Notes
  • Ningning Tang
    Department of Ophthalmology, The People's Hospital of Guangxi Zhuang Autonomous Region & Research Center of Ophthalmology, Guangxi Academy of Medical Sciences & Guangxi Key Laboratory of Eye Health & Guangxi Health Commission Key Laboratory of Ophthalmology and Related Systemic Diseases Artificial Intelligence Screening Technology, Nanning, China
  • Guangyi Huang
    Department of Ophthalmology, The People's Hospital of Guangxi Zhuang Autonomous Region & Research Center of Ophthalmology, Guangxi Academy of Medical Sciences & Guangxi Key Laboratory of Eye Health & Guangxi Health Commission Key Laboratory of Ophthalmology and Related Systemic Diseases Artificial Intelligence Screening Technology, Nanning, China
  • Daizai Lei
    Department of Ophthalmology, The People's Hospital of Guangxi Zhuang Autonomous Region & Research Center of Ophthalmology, Guangxi Academy of Medical Sciences & Guangxi Key Laboratory of Eye Health & Guangxi Health Commission Key Laboratory of Ophthalmology and Related Systemic Diseases Artificial Intelligence Screening Technology, Nanning, China
  • Li Jiang
    Department of Ophthalmology, The People's Hospital of Guangxi Zhuang Autonomous Region & Research Center of Ophthalmology, Guangxi Academy of Medical Sciences & Guangxi Key Laboratory of Eye Health & Guangxi Health Commission Key Laboratory of Ophthalmology and Related Systemic Diseases Artificial Intelligence Screening Technology, Nanning, China
  • Qi Chen
    Department of Ophthalmology, The People's Hospital of Guangxi Zhuang Autonomous Region & Research Center of Ophthalmology, Guangxi Academy of Medical Sciences & Guangxi Key Laboratory of Eye Health & Guangxi Health Commission Key Laboratory of Ophthalmology and Related Systemic Diseases Artificial Intelligence Screening Technology, Nanning, China
  • Wenjing He
    Department of Ophthalmology, The People's Hospital of Guangxi Zhuang Autonomous Region & Research Center of Ophthalmology, Guangxi Academy of Medical Sciences & Guangxi Key Laboratory of Eye Health & Guangxi Health Commission Key Laboratory of Ophthalmology and Related Systemic Diseases Artificial Intelligence Screening Technology, Nanning, China
  • Fen Tang
    Department of Ophthalmology, The People's Hospital of Guangxi Zhuang Autonomous Region & Research Center of Ophthalmology, Guangxi Academy of Medical Sciences & Guangxi Key Laboratory of Eye Health & Guangxi Health Commission Key Laboratory of Ophthalmology and Related Systemic Diseases Artificial Intelligence Screening Technology, Nanning, China
  • Yiyi Hong
    Department of Ophthalmology, The People's Hospital of Guangxi Zhuang Autonomous Region & Research Center of Ophthalmology, Guangxi Academy of Medical Sciences & Guangxi Key Laboratory of Eye Health & Guangxi Health Commission Key Laboratory of Ophthalmology and Related Systemic Diseases Artificial Intelligence Screening Technology, Nanning, China
  • Jian Lv
    Department of Ophthalmology, The People's Hospital of Guangxi Zhuang Autonomous Region & Research Center of Ophthalmology, Guangxi Academy of Medical Sciences & Guangxi Key Laboratory of Eye Health & Guangxi Health Commission Key Laboratory of Ophthalmology and Related Systemic Diseases Artificial Intelligence Screening Technology, Nanning, China
  • Yuanjun Qin
    Department of Ophthalmology, The People's Hospital of Guangxi Zhuang Autonomous Region & Research Center of Ophthalmology, Guangxi Academy of Medical Sciences & Guangxi Key Laboratory of Eye Health & Guangxi Health Commission Key Laboratory of Ophthalmology and Related Systemic Diseases Artificial Intelligence Screening Technology, Nanning, China
  • Yunru Lin
    Department of Ophthalmology, The People's Hospital of Guangxi Zhuang Autonomous Region & Research Center of Ophthalmology, Guangxi Academy of Medical Sciences & Guangxi Key Laboratory of Eye Health & Guangxi Health Commission Key Laboratory of Ophthalmology and Related Systemic Diseases Artificial Intelligence Screening Technology, Nanning, China
  • Qianqian Lan
    Department of Ophthalmology, The People's Hospital of Guangxi Zhuang Autonomous Region & Research Center of Ophthalmology, Guangxi Academy of Medical Sciences & Guangxi Key Laboratory of Eye Health & Guangxi Health Commission Key Laboratory of Ophthalmology and Related Systemic Diseases Artificial Intelligence Screening Technology, Nanning, China
  • Yikun Qin
    Department of Ophthalmology, The People's Hospital of Guangxi Zhuang Autonomous Region & Research Center of Ophthalmology, Guangxi Academy of Medical Sciences & Guangxi Key Laboratory of Eye Health & Guangxi Health Commission Key Laboratory of Ophthalmology and Related Systemic Diseases Artificial Intelligence Screening Technology, Nanning, China
  • Rushi Lan
    Guangxi Key Laboratory of Image and Graphic Intelligent Processing, Guilin University of Electronic Technology, Guilin, China
  • Xipeng Pan
    Guangxi Key Laboratory of Image and Graphic Intelligent Processing, Guilin University of Electronic Technology, Guilin, China
    Department of Radiology, Guangdong Provincial People's Hospital, Guangdong Academy of Medical Sciences, Guangzhou, China
  • Min Li
    Department of Ophthalmology, The People's Hospital of Guangxi Zhuang Autonomous Region & Research Center of Ophthalmology, Guangxi Academy of Medical Sciences & Guangxi Key Laboratory of Eye Health & Guangxi Health Commission Key Laboratory of Ophthalmology and Related Systemic Diseases Artificial Intelligence Screening Technology, Nanning, China
  • Fan Xu
    Department of Ophthalmology, The People's Hospital of Guangxi Zhuang Autonomous Region & Research Center of Ophthalmology, Guangxi Academy of Medical Sciences & Guangxi Key Laboratory of Eye Health & Guangxi Health Commission Key Laboratory of Ophthalmology and Related Systemic Diseases Artificial Intelligence Screening Technology, Nanning, China
  • Peng Lu
    Department of Ophthalmology, The People's Hospital of Guangxi Zhuang Autonomous Region & Research Center of Ophthalmology, Guangxi Academy of Medical Sciences & Guangxi Key Laboratory of Eye Health & Guangxi Health Commission Key Laboratory of Ophthalmology and Related Systemic Diseases Artificial Intelligence Screening Technology, Nanning, China
  • Correspondence: Fan Xu, People's Hospital of Guangxi Zhuang Autonomous Region, No. 6 Taoyuan Rd, Nanning, Guangxi 530021, China. e-mail: oph_fan@163.com 
  • Peng Lu, Department of Ophthalmology, The People's Hospital of Guangxi Zhuang Autonomous Region #6 Taoyuan Rd, Nanning 530000, China. e-mail: 365989980@qq.com 
  • Footnotes
    *  NT, GH, DL, and LJ have contributed equally to this work.
Translational Vision Science & Technology April 2023, Vol.12, 8. doi:https://doi.org/10.1167/tvst.12.4.8
Abstract

Purpose: Accurate identification of corneal layers with in vivo confocal microscopy (IVCM) is essential for the correct assessment of corneal lesions. This project aims to obtain a reliable automated identification of corneal layers from IVCM images.

Methods: A total of 7957 IVCM images were included for model training and testing. Scanning depth information and pixel information of IVCM images were used to build the classification system. First, two base classifiers, based on convolutional neural networks and K-nearest neighbors, respectively, were constructed. Second, two hybrid strategies, namely the weighted voting method and the light gradient boosting machine (LightGBM) algorithm, were used to fuse the results from the two base classifiers and obtain the final classification. Finally, the confidence of the prediction results was stratified to help identify model errors.

Results: Both hybrid systems outperformed the two base classifiers. The weighted area under the curve, weighted precision, weighted recall, and weighted F1 score were 0.9841, 0.9096, 0.9145, and 0.9111 for the weighted voting hybrid system, and 0.9794, 0.9039, 0.9055, and 0.9034 for the LightGBM stacking hybrid system, respectively. More than one-half of the misclassified samples were identified using the confidence stratification method.

Conclusions: The proposed hybrid approach could effectively integrate the scanning depth and pixel information of IVCM images, allowing for the accurate identification of corneal layers in grossly normal IVCM images. The confidence stratification approach was useful for identifying misclassifications made by the system.

Translational Relevance: The proposed hybrid approach lays important groundwork for the automatic identification of corneal layers in IVCM images.

Introduction
Corneal disease is one of the leading causes of blindness in the world.1 More than 12 million people are affected by corneal blindness worldwide.2 Imaging techniques are essential for the correct diagnosis of corneal disease. In vivo confocal microscopy (IVCM) is an important noninvasive imaging modality for the diagnosis of corneal diseases.3 IVCM can visualize cellular components with a resolution of up to 1 micron (µm). The high-resolution images of the various corneal layers offer highly detailed information about the tissue morphology of the cornea in vivo. Therefore, IVCM is widely used in assessing corneal diseases. 
By analyzing the corneal layer in which a lesion is located in IVCM images, the extent of the lesion along the depth axis can be judged, which directly contributes to the establishment of clinical management strategies. Anatomically, the cornea consists of five layers, which are (from anterior to posterior): the epithelium (EP), Bowman's membrane (BM), stroma (S), Descemet's membrane (DM), and endothelium. Treatment modalities vary according to the extent of the lesion. For example, topical eye drop application is the preferred treatment option for lesions confined to the corneal EP or superficial S; lamellar keratoplasty4 and intrastromal injection5,6 are optional treatments for patients with deep S infiltration; endothelial keratoplasty4 is applied for patients with endothelial diseases; and penetrating keratoplasty4 may be considered for the treatment of full-thickness corneal infection and opacity. Therefore, the anatomical localization of the lesion is of great importance. The different texture features in IVCM images allow ophthalmologists to distinguish different corneal layers and thus infer the depth of the lesion. Manual analysis of these images, however, is extremely labor intensive, time consuming, and inherently subjective, and requires expertise. Automation is, therefore, urgently needed and will facilitate standardized analysis among different centers. 
Machine learning methods have been proposed for the automatic recognition of specific corneal layers in IVCM images. Ruggeri and Pajaro7 developed a three-layer artificial neural network-based model that reached recall rates of 100%, 84%, and 100% for the classification of the sub-basal nerve plexus, S, and endothelium, respectively. Elbita et al.8 established an optimized artificial neural network model and obtained an overall accuracy of 97.22% for EP, S, and endothelium classification. Sharif et al.9 demonstrated that combining classifier outputs with a weighted average approach could enhance classification performance. Although these results are encouraging, these studies were based on traditional (non-deep-learning) machine learning methods and relied heavily on feature engineering, which is a cumbersome process. In addition, these studies involved small sample sizes (only hundreds of images), which limited the generalizability of their results. Most importantly, only some of the corneal layers were investigated; therefore, the previous studies were not directly applicable to clinical practice. Collectively, novel approaches are needed to identify all corneal layers with high reliability. 
One of the greatest difficulties in the identification of corneal layers lies in the recognition of the BM and DM. Both are acellular, basement membrane layers composed of collagen.10 Therefore, the BM and DM are morphologically indistinguishable owing to the lack of characteristic imaging features. Scanning depth information helps to locate the layers to some extent. However, even minor eye movements can affect the recording of scanning depth during image acquisition. Thus, it is not accurate to determine the corneal layers solely on the basis of scanning depth. A fusion strategy should be incorporated to maximize the usefulness of the information. In light of these factors, the present study aimed to explore a hybrid approach to identify the corneal layers accurately. We attempted to combine the image pixel information and the scanning depth information to improve the accuracy. Hybrid strategies were implemented to integrate these heterogeneous sources of information. First, two base classifiers, one analyzing the pixel information of IVCM images based on a convolutional neural network (CNN) algorithm and the other analyzing scanning depth information based on a K-nearest neighbor (KNN) algorithm, were constructed. Second, two hybrid strategies, namely the weighted voting method and the light gradient boosting machine (LightGBM) algorithm, were used to fuse the results from the two base classifiers and obtain the final classification. Finally, a confidence stratification of the prediction results was conducted to help identify model errors and thus further improve classification accuracy. 
Methods
Dataset
The image dataset for this study included a total of 11,009 IVCM images obtained from 100 eyes (41 right eyes and 59 left eyes) of 100 patients (43 males and 57 females) between April 2020 and September 2021 in the Department of Ophthalmology, Guangxi Zhuang Autonomous Region People's Hospital. The inclusion criteria were as follows: (1) the examined eyes had no apparent corneal lesions or had only mild changes such as dry eye syndrome, diabetic neuropathy, or pterygium; (2) all layers, including the EP, sub-basal nerve plexus, BM, S, DM, and endothelium, were scanned to obtain an image set of the complete corneal structure; (3) patients were aged 18 years or older; and (4) if a patient had more than one examination during the time period, only the first was included. Exclusion criteria included (1) low-quality images, in which the morphologies of the corneal layers were blurry and not clearly detectable owing to failure of camera focusing, inadequate exposure, or severe eye movement; (2) images with obvious corneal pathological changes; (3) images in which the recognizable area was less than one-quarter of the total image area; and (4) images containing areas from two or more adjacent layers. This study was conducted in compliance with the Declaration of Helsinki and approved by the ethics committee of The People's Hospital of Guangxi Zhuang Autonomous Region. Informed consent was waived because of the retrospective nature of the study and the completely anonymized use of the images. Patient re-identification was impossible because the link between patient ID and study ID was removed upon data export. 
All images were taken using IVCM (HRT III/RCM; Heidelberg Engineering, Heidelberg, Germany) with a Rostock Cornea Module, as previously described.11 Sequence scans of nonoverlapping areas of the central cornea were recorded, and an average of 110 scans was performed in each eye. Each image was 384 × 384 pixels, representing a coronal section measuring 400 µm × 400 µm with a resolution of approximately 1 µm. Each scan had a corresponding scanning depth, which represented the perpendicular distance from the scanning plane to the reference plane (Fig. 1). 
Figure 1.
 
Illustration of an IVCM image and its corresponding scanning depth. The IVCM image was acquired in the coronal plane (xy plane), showing the cell structure and morphology of a 400 µm × 400 µm region of cornea with up to 800 times magnification. In the image processing program, an IVCM image is a 384 × 384 matrix, of which each element corresponds with a pixel in the image. Each IVCM image was scanned at a particular depth. The scanning depth was measured as the distance from the reference plane to the scanning plane in the direction perpendicular to the coronal plane (z axis).
The dataset was split by eye to ensure that there was no overlap in eyes between the training and test sets. To this end, we split the data based on the time of data collection. The dataset was split into training and test sets in a ratio of approximately 80:20 by eye; this splitting ratio is commonly used in machine learning so that the training and test sets follow as similar a distribution as possible. Because the number of images varied between eyes, the proportion of test images was not exactly equal to 20%. As a result, images collected before June 2021 were included in the training set, which comprised 83.3% of the total images, and images collected between June and September 2021 were included in the test set, which comprised 16.7% of the total images. The training and test sets were mutually exclusive with respect to individual patients. Models were trained and internally tested using a five-fold cross-validation procedure on the training data. External validation was performed on the independent test set to assess model performance. 
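As a rough illustration (not the authors' code), the following Python sketch partitions images by acquisition date, using hypothetical column and file names, and checks that no eye contributes to both sets.

```python
# Minimal sketch of the eye-level, time-based split; column and file names are
# illustrative assumptions rather than the study's actual data layout.
import pandas as pd

df = pd.read_csv("ivcm_images.csv", parse_dates=["scan_date"])  # one row per image

cutoff = pd.Timestamp("2021-06-01")        # images before June 2021 -> training set
train_df = df[df["scan_date"] < cutoff]
test_df = df[df["scan_date"] >= cutoff]    # June to September 2021 -> test set

# The split must be disjoint at the eye level: no eye may appear in both sets.
assert set(train_df["eye_id"]).isdisjoint(test_df["eye_id"])

print(f"train: {len(train_df)} images ({len(train_df) / len(df):.1%})")
print(f"test:  {len(test_df)} images ({len(test_df) / len(df):.1%})")
```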
Labelling and Preprocessing
The IVCM images were analyzed by three ophthalmologists with extensive clinical experience, and a ground-truth corneal layer was assigned to each image when consensus was achieved among them. The dataset was presented to the ophthalmologists in the correct order, so that the relative focus depth information of the images was available during labelling. If no consensus was reached on the classification of an image (generally owing to adjacent layers being shown in the same image), the image was removed to minimize classification bias. All included images were assigned to one of six classes: (1) EP, (2) sub-basal nerve plexus (N), (3) BM, (4) S, (5) DM, and (6) endothelium. Representative images for each class of corneal layer are shown in Figure 2. 
Figure 2.
 
IVCM images of different corneal layers. (A) Epithelium, (B) sub-basal nerve plexus, (C) BM, (D) S, (E) DM, and (F) endothelium.
This study classified the corneal layers as meticulously as possible to ensure that the models sufficiently learned the underlying mapping between features and categories, which guarantees the accuracy of prediction. Moreover, the fine classification provides more options for users: the output classes can be combined according to the specific issues to be addressed. An example of simplified models in which N was not distinguished from BM is shown in the Supplementary Material, where the classification results for each layer are shown in Supplementary Figure S1 and the overall classification performance is shown in Supplementary Figure S2. 
The pixel values of the images were normalized to the range [0, 1] before being input to the models. The original IVCM images were resized to a standard resolution of 224 × 224 pixels to match the input size of the CNN. A data augmentation technique12 was applied to balance the number of images across categories: horizontal flipping, vertical flipping, and combined horizontal and vertical flipping were applied to the images of BM and DM to increase the amount of data four-fold. 
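A minimal sketch of this preprocessing, assuming grayscale images loaded with Pillow and NumPy (not the authors' exact pipeline), is given below.

```python
# Illustrative normalization, resizing, and flip-based augmentation.
import numpy as np
from PIL import Image

def preprocess(path):
    """Load an IVCM image, resize it to the CNN input size, and scale pixels to [0, 1]."""
    img = Image.open(path).convert("L").resize((224, 224))
    return np.asarray(img, dtype=np.float32) / 255.0

def augment_flips(img):
    """Return the image plus its horizontal, vertical, and combined flips
    (the four-fold augmentation applied only to the BM and DM classes)."""
    return [
        img,
        np.fliplr(img),             # horizontal flip
        np.flipud(img),             # vertical flip
        np.flipud(np.fliplr(img)),  # horizontal + vertical flip
    ]
```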
In addition to the preprocessing of the pixel data, the absolute scanning depth of each image was extracted using a template-matching algorithm. It should be kept in mind that two kinds of data obtained from the IVCM scans, the image pixel information and the scanning depth information, were used for analysis. The image pixel data and the scanning depth were used as the input of the CNN classifier and the KNN classifier, respectively. 
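The exact template-matching procedure is not detailed in the text; the sketch below is only an assumed illustration of how depth digits overlaid on an image might be read with OpenCV template matching. The overlay location, digit templates, and matching threshold are all hypothetical.

```python
# Heavily simplified, hypothetical digit reading by template matching (OpenCV).
import cv2
import numpy as np

def read_depth(image, digit_templates, threshold=0.8):
    """Match digit templates in an assumed depth-overlay region and return the depth in µm."""
    region = image[0:20, 0:80]  # assumed location of the depth annotation
    hits = []
    for digit, tmpl in digit_templates.items():  # e.g. {"0": array, ..., "9": array}
        res = cv2.matchTemplate(region, tmpl, cv2.TM_CCOEFF_NORMED)
        ys, xs = np.where(res >= threshold)
        hits.extend((x, digit) for x in xs)
    digits = "".join(d for _, d in sorted(hits))  # order matched digits left to right
    return int(digits) if digits else None
```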
The process of constructing the classification system consisted of two steps: constructing the base classifiers and constructing the hybrid systems. First, we developed two base classifiers, based on a CNN and a KNN, respectively. Second, we used two hybrid strategies, namely a weighted voting method and the LightGBM algorithm, to aggregate the outputs of the two base classifiers and obtain the ultimate classification outcome. An overview of the model building process is shown in Figure 3. 
Figure 3.
 
An overview of the hybrid model building process. The process of constructing the classification system consisted of two primary steps: constructing base classifiers and constructing hybrid systems. First, the two base classifiers, namely CNN and KNN classifiers, were built. Second, two hybrid strategies, namely weighted voting method and LightGBM algorithm, were used to aggregate the outputs of the two base classifiers and obtain the ultimate classification outcome.
Base Classifier 1: Image Pixel-Based CNN
The first base classifier was based on a CNN algorithm, which classified the corneal layers using the pixel information of IVCM images. The CNN classifier automatically extracted discriminative pixel features from images through an end-to-end deep learning architecture.13 In our previous studies, different CNN algorithms were compared and Inception-ResNet V2 showed good performance in IVCM image classification.14,15 Thus, Inception-ResNet V2 was used to construct the CNN model. The core of the model is the Residual Inception block, which combines two concepts: an additional factorization strategy is used to scale up the network, and shortcut connections are introduced to eliminate gradient degradation. 
The pixel data of the preprocessed images were used as input. The weights of the network were fine-tuned by backpropagation. During training, the categorical cross-entropy loss was optimized using the stochastic gradient descent method. Five-fold cross-validation was used to optimize the hyperparameters of the models during the training process. With this approach, the training data were randomly split into five subsets. Each time, four subsets were used as the training set and one was withheld as the validation set. The process was iterated five times until each of the five subsets had been used as the validation set once. The final models were trained on the entirety of the training dataset. 
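A minimal TensorFlow/Keras sketch of this classifier is shown below; the optimizer settings, pretrained weights, and input shape are illustrative assumptions, not the exact values used in the study.

```python
# Sketch: fine-tuning Inception-ResNet V2 with SGD and categorical cross-entropy.
import tensorflow as tf
from sklearn.model_selection import KFold

def build_cnn(num_classes=6):
    base = tf.keras.applications.InceptionResNetV2(
        include_top=False, weights="imagenet",
        input_shape=(224, 224, 3), pooling="avg")
    outputs = tf.keras.layers.Dense(num_classes, activation="softmax")(base.output)
    model = tf.keras.Model(base.input, outputs)
    model.compile(
        optimizer=tf.keras.optimizers.SGD(learning_rate=1e-3, momentum=0.9),
        loss="categorical_crossentropy", metrics=["accuracy"])
    return model

# Five-fold cross-validation over the training images for hyperparameter tuning:
# each fold trains on four subsets and validates on the remaining one.
# for train_idx, val_idx in KFold(n_splits=5, shuffle=True).split(x_train):
#     model = build_cnn()
#     model.fit(x_train[train_idx], y_train[train_idx],
#               validation_data=(x_train[val_idx], y_train[val_idx]))
```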
Base Classifier 2: Scanning Depth-Based KNN
The second base classifier was based on a KNN algorithm, which classified corneal layers using the scanning depth data. The KNN is an instance-based learning method that classifies a sample by calculating its distance from all the others.16,17 First, the absolute scanning depth of each image was extracted, as described above. Although the scanning depth value was zeroed before each scan, the depth value of the first EP image might not be exactly zero owing to inevitable eye movement during the scanning process. Additionally, the distribution of the corneal layers relative to the absolute depth value might vary between individuals owing to differences in corneal thickness. Therefore, the relative depth was calculated for each image as follows:  
\begin{eqnarray} D_{relative} = \frac{D_{absolute} - D_1}{D_{last} - D_1},\end{eqnarray}
(1)
where Drelative denotes the relative scanning depth value, Dabsolute denotes the absolute scanning depth value, D1  denotes the scanning depth value of the first image, and Dlast denotes the scanning depth value of the last image. 
Then we used the relative scanning depth values as the input to train the KNN model. Classification of an unlabeled image was computed from the majority vote of its nearest neighbors: the most common class among the neighbors was assigned to the query image. The nearest neighbors parameter k represents the number of neighbors chosen to assign the class to the query image. In practice, k is usually chosen to be odd. In this study, the value of k was chosen based on cross-validation. The neighbors were weighted by the Euclidean distance, which was used as the distance metric. Between two vectors vi and vj, the Euclidean distance is defined as the two-norm of their difference. Equation 2 describes the special case of one-dimensional vectors, that is, scalar values, as in the present problem with a one-dimensional feature space (the scanning depth value):  
\begin{eqnarray} d\left( v_i, v_j \right) = \sqrt{\left( v_i - v_j \right)^2}.\end{eqnarray}
(2)
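A compact scikit-learn sketch of this depth-based classifier is given below; the value of k shown is only a placeholder, since the study chose k by cross-validation.

```python
# Sketch: relative-depth normalization (Equation 1) and a distance-weighted KNN.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

def relative_depth(absolute_depths):
    """Map the absolute depths of one scan sequence onto a relative scale (Equation 1)."""
    d = np.asarray(absolute_depths, dtype=float)
    return (d - d[0]) / (d[-1] - d[0])

# Depth is a one-dimensional feature, so it is reshaped to (n_samples, 1).
knn = KNeighborsClassifier(n_neighbors=15, weights="distance", metric="euclidean")
# knn.fit(train_rel_depth.reshape(-1, 1), train_labels)
# depth_probs = knn.predict_proba(test_rel_depth.reshape(-1, 1))
```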
 
Hybrid Approach 1: Weighted Voting
Weighted voting is a simple and intuitive method to fuse the decisions of base classifiers.18,19 In this method, the prediction probabilities of the CNN and KNN classifiers were weighted according to their performance by assigning each a weight. The final decision was made by summing the weighted votes of the two classifiers and choosing the class that received the highest weighted probability score. In other words, the class predicted by the weighted voting system, Cwv, is described as  
\begin{eqnarray} {C_{wv}} = \arg \max \left( P_{wv,1}, \ldots , P_{wv,i}, \ldots , P_{wv,6} \right)\end{eqnarray}
(3)
 
\begin{eqnarray} P_{wv,i} = P_{CNN,i} \times {w_{CNN}} + P_{KNN,i} \times {w_{KNN}}\end{eqnarray}
(4)
 
\begin{eqnarray} {w_{CNN}} + {w_{KNN}} = 1, \end{eqnarray}
(5)
where i = 1, 2, …, 6; Pwv,i, PCNN,i, and PKNN,i represent the probabilities assigned to the i-th class by the weighted voting system, the CNN, and the KNN, respectively; and wCNN and wKNN are the weights for the CNN and KNN classifiers, respectively. The best weights were determined based on five-fold cross-validation. Specifically, different weight values were used to perform a five-fold cross-validation, and the area under the curve (AUC) was calculated in each fold. Afterward, the mean AUC was calculated by averaging the five AUCs. Finally, the optimal weight was determined as the weight that yielded the best mean AUC over the five-fold cross-validation. 
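The fusion rule in Equations 3 to 5 amounts to a convex combination of the two probability maps; a minimal sketch is shown below (the weight grid and the helper used to score each weight are assumptions).

```python
# Sketch: weighted voting fusion of the CNN and KNN probability maps (Equations 3-5).
import numpy as np

def weighted_vote(p_cnn, p_knn, w_cnn):
    """p_cnn, p_knn: arrays of shape (n_samples, 6); the KNN weight is implicitly 1 - w_cnn."""
    p_wv = w_cnn * p_cnn + (1.0 - w_cnn) * p_knn
    return p_wv, p_wv.argmax(axis=1)  # fused probabilities and predicted class indices

# The weight would be selected by scanning candidate values and keeping the one with the
# best mean five-fold cross-validated AUC; mean_cv_auc is a hypothetical helper.
# best_w = max(np.arange(0.0, 1.01, 0.05), key=mean_cv_auc)
```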
Hybrid Approach 2: LightGBM Stacking
LightGBM was the other method used to combine the decisions of the CNN and KNN classifiers. We used the ground truth and the probability maps of both the CNN and the KNN to train the LightGBM model. LightGBM is a state-of-the-art machine learning algorithm20 that speeds up the training process of gradient boosting decision trees. LightGBM has been shown to achieve high efficiency and scalability, owing to its histogram-based implementation, gradient-based one-side sampling, and exclusive feature bundling techniques. The leaf-wise construction of trees allows for a good balance between preserving the accuracy of the learned decision trees and reducing overfitting. 
In this study, the boosting type was set at “gbdt” and the learning rate was set at 0.05. The other parameters were set as follows: max depth, 10; lambda L1, 0.2; lambda L2, 0.2; min split gain, 0.02; and metric, multi_error. 
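A sketch of this stacking step with the reported parameters is given below; other arguments, such as the number of boosting rounds and the exact feature layout, are assumptions.

```python
# Sketch: LightGBM stacking model trained on the concatenated CNN and KNN probability maps.
import numpy as np
import lightgbm as lgb

# Stacked features: 6 CNN probabilities + 6 KNN probabilities per image.
# X_stack = np.hstack([p_cnn_train, p_knn_train]); y = train_labels
params = {
    "boosting_type": "gbdt",
    "objective": "multiclass",
    "num_class": 6,
    "learning_rate": 0.05,
    "max_depth": 10,
    "lambda_l1": 0.2,
    "lambda_l2": 0.2,
    "min_split_gain": 0.02,
    "metric": "multi_error",
}
# booster = lgb.train(params, lgb.Dataset(X_stack, label=y), num_boost_round=500)
# p_stack = booster.predict(np.hstack([p_cnn_test, p_knn_test]))
```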
All models were developed in the Python programming language (Python Language Reference, version 3.7; Python Software Foundation) using PyCharm (PyCharm Community Edition 2020.3.1; JetBrains). The KNN, weighted voting, and LightGBM models were trained on a CPU server (Intel Core i5-8250U CPU @ 2.60 GHz). The CNN model was trained on an NVIDIA Tesla T4 Tensor Core GPU. 
Stratification of Prediction Confidence
We evaluated the confidence of the final classification results by analyzing the differences between the predicted labels of the CNN and the KNN. The greater the difference, the lower the prediction confidence and the more likely an error in classification. The corneal layer labels were coded with numbers according to their sequential positions. Essentially, the sub-basal nerve plexus layer and the BM are different morphologies of the same anatomical layer; they differ in morphology but not in depth. Because the two classes are distributed over the same depth interval in the KNN, their relative differences from the other classes could be considered the same. To facilitate the analysis, we encoded the sub-basal nerve plexus layer and the BM with the same number. Therefore, the EP, sub-basal nerve plexus/BM, S, DM, and endothelium were numbered 1, 2, 3, 4, and 5, respectively. 
The absolute value of the difference between the numbers of the layers predicted by the CNN and the KNN was denoted as Ndif. We set warning signs when Ndif ≥ 1 and partitioned the confidence warning ratings into low level when Ndif = 1, medium level when Ndif = 2, and high level when Ndif = 3 or 4. Finally, the proportions of each confidence level among the correctly and incorrectly predicted samples were calculated. 
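The rule reduces to a small lookup, as in the sketch below (the label strings are illustrative).

```python
# Sketch: confidence-warning level from the CNN/KNN disagreement (Ndif).
LAYER_CODE = {"EP": 1, "N": 2, "BM": 2, "S": 3, "DM": 4, "EN": 5}  # N and BM share a code

def confidence_warning(cnn_label, knn_label):
    n_dif = abs(LAYER_CODE[cnn_label] - LAYER_CODE[knn_label])
    if n_dif == 0:
        return "no warning"
    if n_dif == 1:
        return "low"
    if n_dif == 2:
        return "medium"
    return "high"  # Ndif of 3 or 4

# Example: CNN predicts S, KNN predicts EN -> |3 - 5| = 2 -> "medium" warning.
```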
Performance Assessment
The classification task exhibited class imbalance owing to the small number of images of BM and DM. At present, there is still no standard performance measure for multiclass class-imbalance problems. We therefore used weighted measures to evaluate the average performance. The weighted metrics offer a better estimate of overall performance when classes are imbalanced, because they average binary metrics with each class's score weighted by its presence in the dataset. 
The precision and recall were computed for each class. Precision is the fraction of true-positive samples among the samples predicted as positive, and recall is the fraction of true-positive samples among all samples of the class. Precision can be regarded as a measure of quality and recall as a measure of quantity. Then, the weighted precision and weighted recall were calculated by weighting the precision and recall of each class by the number of samples from that class. A weighted F1 score was computed using the same weighting approach. The F1 score is a trade-off between precision and recall, reaching its best value at 1 and its worst at 0. The detailed calculation process was as follows:  
\begin{eqnarray} Precision_i = \frac{TP_i}{TP_i + FP_i}\end{eqnarray}
(6)
 
\begin{eqnarray} Precision_{weighted} = \frac{\sum_{i = 1}^{C} Precision_i \times n_i}{N}\end{eqnarray}
(7)
 
\begin{eqnarray} Recall_i = \frac{TP_i}{TP_i + FN_i}\end{eqnarray}
(8)
 
\begin{eqnarray} Recall_{weighted} = \frac{\sum_{i = 1}^{C} Recall_i \times n_i}{N}\end{eqnarray}
(9)
 
\begin{eqnarray} F_1\;score = \frac{2 \times Precision \times Recall}{Precision + Recall}\end{eqnarray}
(10)
 
\begin{eqnarray} F_1\;score_{weighted} = \frac{\sum_{i = 1}^{C} F_1\;score_i \times n_i}{N}.\end{eqnarray}
(11)
C represents the number of classes; TPi, FPi, TNi, and FNi denote the number of true positives, false positives, true negatives, and false negatives for the i-th class, respectively; Precisioni and Recalli denote the precision and recall of the i-th class; ni denotes the number of samples from the i-th class; and N denotes the total number of samples in the test data. 
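These weighted averages correspond to scikit-learn's "weighted" averaging, as the toy sketch below illustrates (the labels are made up for illustration only).

```python
# Sketch: weighted precision, recall, and F1 (Equations 7, 9, and 11) via scikit-learn.
from sklearn.metrics import precision_recall_fscore_support

y_true = ["EP", "EP", "S", "S", "S", "EN"]   # toy ground-truth labels
y_pred = ["EP", "S", "S", "S", "EN", "EN"]   # toy predictions
precision_w, recall_w, f1_w, _ = precision_recall_fscore_support(
    y_true, y_pred, average="weighted", zero_division=0)
print(precision_w, recall_w, f1_w)
```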
Receiver operating characteristic (ROC) curves, which plot the true-positive rate against the false-positive rate, were generated to evaluate model performance on the external test set. The ROC curves for each class are plotted in Figure 4 (A for the CNN classifier, B for the KNN classifier, C for the weighted voting hybrid system, and D for the LightGBM stacking hybrid system). The weighted average ROC curves, in which the true-positive rate and false-positive rate were weighted by the prevalence of each class, were constructed for each approach in Figure 5. The area under the weighted average ROC curve (weighted AUC) was calculated in a one-vs-rest manner,21 which means that the ROC AUC score was computed for each class against all other classes, and the scores were averaged with weights given by class prevalence as follows:  
\begin{eqnarray} weighted\;AUC = \sum_{i = 1}^{C} w_i \times AUC_i,\end{eqnarray}
(12)
where wi is the ratio of the number of samples from the i-th class (ni) to the total number of samples (N), and AUCi represents the AUC of the i-th class. The 95% confidence intervals (95% CIs) were calculated for each weighted AUC. A random classifier would have a weighted AUC of 0.5, and a perfect classifier would have a weighted AUC of 1. 
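Equation 12 matches scikit-learn's weighted one-vs-rest AUC, as sketched below; y_score stands for the per-class probability output of a model on the test set.

```python
# Sketch: weighted one-vs-rest AUC (Equation 12).
from sklearn.metrics import roc_auc_score

# y_true: true layer labels; y_score: (n_samples, 6) predicted probabilities.
# weighted_auc = roc_auc_score(y_true, y_score, multi_class="ovr", average="weighted")
```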
Figure 4.
 
The ROC curves and the AUC values of each layer for different approaches.
Figure 5.
 
The weighted average ROC curves and weighted AUC of the base classifiers and hybrid systems. The ROC curves of each layer were weighted into one ROC curve to summarize the overall test performance for each model, from which the weighted AUC could be calculated.
Results
A retrospective series of 11,009 images from 100 patients was collected initially. All subjects were of Chinese Han ethnicity, so they could communicate well and cooperate with the IVCM examination. A total of 3052 images were excluded: 1316 images from 13 eyes were excluded completely because images of some corneal layers were missing and the scanning sequence was incomplete, and 1736 images from the remaining eyes were excluded individually based on the exclusion criteria noted elsewhere in this article. Finally, a total of 7957 images were included for model training and testing. The original training set consisted of 6624 images before image augmentation, which included 1216 images of EP, 1292 of N, 178 of BM, 2972 of S, 115 of DM, and 851 of EN. Data augmentation was applied to the images of BM and DM to balance the training data; therefore, the number of images increased to 712 for BM and 460 for DM after augmentation. The external test set consisted of 1333 images, including 224, 178, 42, 701, 29, and 159 images of EP, N, BM, S, DM, and EN, respectively. External validation results were reported as measures of the realistic performance of the CNN classifier, KNN classifier, weighted voting hybrid system, and LightGBM stacking hybrid system (Table). 
Table.
 
Performance of the Base Classifiers and Hybrid Systems
Performance of Base Classifiers
Two types of base classifiers were used to classify the corneal layers. Considering the CNN first, this model learned the relationship between the raw pixel features and the class labels through end-to-end learning. As shown in Figure 4, the CNN performed better in classifying the EP, N, S, and EN layers than the BM and DM layers. The CNN classifier achieved a weighted AUC of 0.9542 (95% CI, 0.9359–0.9723) in the multiclass classification of corneal layers. The weighted precision, weighted recall, and weighted F1 score were 0.8920, 0.8919, and 0.8893, respectively. 
The other base classifier was the KNN, which learned the relationship between the scanning depth information and the corneal layers through nonparametric learning. We noted that the relative scanning depth showed differential peak distributions among the corneal layers, with the S showing the widest range (Fig. 6). However, there was some degree of overlap between adjacent layers, which limited the KNN classifier's performance. The results showed that the KNN classifier exhibited a worse AUC for each layer than the CNN classifier, with the BM and DM layers having the worst outcomes. The KNN classification performance was 0.9309 (95% CI, 0.9096–0.9521) for weighted AUC, 0.7447 for weighted precision, 0.7607 for weighted recall, and 0.7483 for weighted F1 score. 
Figure 6.
 
The relationship between the scanning depth information and the corneal layers. (A) shows the relative class frequency among the labeled samples of each corneal layer, and (B) shows the range of the relative scanning depth in each corneal layer class, where the ranges of distribution of adjacent layers, to some extent, overlap with each other.
Performance of Hybrid Systems
Both hybrid systems outperformed the base classifiers. As shown in Figures 4 and 5, the AUC of each layer for the two hybrid systems improved substantially compared with that of the base classifiers. The weighted voting hybrid approach is fairly straightforward: it computes a weighted linear combination of the results of the CNN and KNN classifiers. Although simple, this strategy achieved good results in this study. The weighted AUC was 0.9841 (95% CI, 0.9729–0.9948), the weighted precision was 0.9096, the weighted recall was 0.9145, and the weighted F1 score was 0.9111. 
The LightGBM stacking hybrid approach was relatively complex: it integrated the results of the CNN and KNN classifiers using a machine learning method. Its performance was comparable with that of the weighted voting hybrid approach and superior to that of the two base classifiers. The weighted AUC, weighted precision, weighted recall, and weighted F1 score were 0.9794 (95% CI, 0.9670–0.9917), 0.9039, 0.9055, and 0.9034, respectively. 
Prediction Confidence
The confidence levels of the prediction results of the hybrid systems were evaluated by comparing the differences between the results of the two base classifiers. By this measure, more than one-half of the misclassified samples were found and generated a confidence warning, whereas only a small proportion of the correctly classified images triggered a warning. The proportions of images with different levels of confidence warning among the correctly and incorrectly classified images are shown in pie charts (Fig. 7). Of the images incorrectly classified by the weighted voting hybrid system, 24.56% (n = 28), 17.54% (n = 20), 22.81% (n = 26), and 35.09% (n = 40) were rated as high level, medium level, low level, and no warning, respectively. The overall warning rate was 64.91% for the misclassified samples. Of the images correctly classified by the weighted voting hybrid system, 0% (n = 0), 3.84% (n = 48), 15.26% (n = 186), and 80.80% (n = 985) were rated as high level, medium level, low level, and no warning, respectively. The warned samples accounted for 19.20% of the correctly classified samples. 
Figure 7.
 
The proportions of images with different levels of confidence warning among the correctly and incorrectly classified images.
The confidence evaluation results were slightly worse for the LightGBM stacking hybrid system but remained effective. Of the images incorrectly classified by the LightGBM stacking hybrid system, 21.43% (n = 27), 15.87% (n = 19), 22.22% (n = 26), and 40.48% (n = 47) were evaluated as high level, medium level, low level, and no warning, respectively. The overall warning rate was 59.52% for misclassified samples. Of the images correctly classified by the LightGBM stacking hybrid system, 0.08% (n = 1), 3.98% (n = 49), 15.24% (n = 186), and 80.70% (n = 978) were evaluated as high level, medium level, low level, and no warning, respectively. The warned samples accounted for 19.30% of the correctly classified samples. 
Discussion
In this study, we presented hybrid approaches to classify corneal layers accurately. The CNN was used to extract the discriminative pixel features in IVCM images, and the KNN was used to learn the relationship between the scanning depth and the corneal layers. Then, hybrid systems were established to integrate the results of both the CNN and the KNN and to classify the corneal IVCM images into six categories corresponding to different anatomical hierarchies. Additionally, a confidence level was rated for each prediction result. Both hybrid approaches obtained better performance than either the CNN or the KNN alone, and the confidence stratification strategy offers clear advantages for clinical application. 
The proposed system was robust and able to deal with all corneal layers, whereas none of the previous studies addressed the identification of the BM and DM owing to the difficulty of distinguishing them. The system could help to determine the corneal layer of grossly normal images. This process, in turn, allows us to infer the layers of the remaining abnormal images and determine the extent of corneal lesions indirectly. The present study is a foundation on which future studies can build and improve. Further studies will be conducted to combine the above method with anomaly detection algorithms for lesion detection to achieve fully automated localization of corneal lesions. 
The hybrid approach has several methodological strengths. On the one hand, the system analyzed heterogeneous data comprising both structured and unstructured information; therefore, more features of multidimensional data were used for the classification. On the other hand, the proposed method combined the advantages of both the CNN and the KNN. The CNN was characterized by high accuracy; however, once an error occurred, it was irregular, and the image might be misclassified into any layer. In contrast, although the accuracy of the KNN was relatively low, its misjudgments tended to be restricted to adjacent layers. By combining the two strategies, this study further improved the classification accuracy and allowed the prediction confidence to be evaluated. 
Two hybrid methods were developed in this study, and the results showed that the performance indices of the weighted voting approach were slightly better than those of LightGBM. A possible reason is that the number of features used for decision fusion was small; thus, a weighted voting model might fit the data better. It should be noted, however, that the difference between the two hybrid approaches was rather small, and there was some overlap of their 95% CIs. Therefore, the difference might not reach statistical significance. Future research with larger numbers of features and samples is needed to compare the models' performance further. 
This study innovatively conducted a comprehensive analysis of the results of a CNN and a KNN to stratify prediction confidence. The difference between the two models contributed to finding potential misclassification errors and generating the confidence warning. The warning alerted ophthalmologists to the need to manually check the model results. This process ensured that an appropriate correction could be implemented, thus preventing misdiagnosis. The results showed that samples triggering a confidence warning accounted for a major percentage of the misclassified samples but only a small part of the correctly classified samples, which means that the accuracy might be improved further by introducing effective warning correction procedures without a marked increase in workload. 
The confidence stratification method was convenient, requiring neither cumbersome calculation nor additional models. Approximately 6 of 10 errors were found. The remaining misclassified images did not trigger a warning because the CNN and the KNN made the same mistake. A majority of these images were classified incorrectly into an adjacent layer by the KNN because their depth was in the overlapping range of adjacent layers. Because the CNN is a black box, the underlying reasons for its misclassifications could not be explained adequately. We speculated that the errors of the CNN might be caused by the atypical morphology of these images. Reassuringly, most of the misclassified images that did not trigger a confidence warning were among the initial or last few images of a sequence; hence, these misclassifications might not have a discernible impact on the overall localization. In contrast, no more than 20% of correctly classified images triggered a confidence warning. These images were classified wrongly by either the CNN or the KNN, but the wrong prediction probability was not very high; thus, they could ultimately be classified correctly by the hybrid model. None of these images led to a high-level warning. In other words, all images with a high-level warning were indeed misclassified. Therefore, vigilance for the confidence warning, especially the high-level warning, might help in the prompt recognition of model errors. 
There were several limitations to our study. First, images with severe lesions were not included. The morphological structures of such images may be severely impaired and even illegible; therefore, the models of this study cannot be generalized directly to these images. The classification of these images requires specialized models with more complex rules and more diverse training sets, which is a priority for our future research. Second, we successfully localized the corneal layer along the depth axis but could not locate the extent of the lesion in the cross-section of the cornea. The high-magnification IVCM images provide local details, but at the cost of losing information about the overall lesion. Other types of images, such as slit-lamp photographs, are needed to develop multimodal methods to achieve three-dimensional localization of the lesion. Third, because of the rigorous inclusion and exclusion criteria, the study suffered from a small sample size. Given the high scanning resolution of IVCM, there were likely some similar images of the same eye included in the same training or test dataset. This factor may have biased our results toward more favorable outcomes and decreased the generalizability of the models. Fourth, the single-center design and the retrospective nature of the study are obvious limitations. A larger scale, prospective, multicenter study is needed to validate the performance of the algorithms and investigate their reproducibility. 
Conclusions
The proposed hybrid approach could effectively integrate the scanning depth information and pixel information of IVCM images, allowing for the accurate identification of corneal layers in grossly normal IVCM images. The confidence stratification approach was useful for identifying misclassifications of the system and thus further improving accuracy. Based on these findings, iterative models will be built for images with pathology to help localize corneal lesions in the future. 
Acknowledgments
The authors gratefully thank all persons who helped in this research. 
Supported by Guangxi Science and Technology Base and Talent Special Fund (Grant numbers [GuikeAD22035011]), the Natural Science Foundation of Guangxi Zhuang Autonomous Region (Grant numbers [No.2020GXNSFBA159015]), Guangxi Clinical Ophthalmic Research Center (Grant numbers [GuikeAD19245193]), Guangxi Zhuang Autonomous Region Health Committee's Self-financing Project (Grant numbers [Z20201322]), and Guangxi Science and Technology Base and Talent Special Fund (Grant numbers [GUI KE AD20297030]). 
Disclosure: N. Tang, None; G. Huang, None; D. Lei, None; L. Jiang, None; Q. Chen, None; W. He, None; F. Tang, None; Y. Hong, None; J. Lv, None; Y. Qin, None; Y. Lin, None; Q. Lan, None; Y. Qin, None; R. Lan, None; X. Pan, None; M. Li, None; F. Xu, None; P. Lu, None 
References
1. Resnikoff S, Pascolini D, Etya'ale D, et al. Global data on visual impairment in the year 2002. Bull World Health Organ. 2004; 82(11): 844–851.
2. Gain P, Jullienne R, He Z, et al. Global survey of corneal transplantation and eye banking. JAMA Ophthalmol. 2016; 134: 167–173, doi:10.1001/jamaophthalmol.2015.4776.
3. Zhivov A, Stachs O, Kraak R, Stave J, Guthoff RF. In vivo confocal microscopy of the ocular surface. Ocul Surf. 2006; 4(2): 81–93, doi:10.1016/s1542-0124(12)70030-7.
4. Tan DT, Dart JK, Holland EJ, Kinoshita S. Corneal transplantation. Lancet. 2012; 379(9827): 1749–1761, doi:10.1016/S0140-6736(12)60437-1.
5. Prakash G, Sharma N, Goel M, Titiyal JS, Vajpayee RB. Evaluation of intrastromal injection of voriconazole as a therapeutic adjunctive for the management of deep recalcitrant fungal keratitis. Am J Ophthalmol. 2008; 146(1): 56–59, doi:10.1016/j.ajo.2008.02.023.
6. Kalaiselvi G, Narayana S, Krishnan T, Sengupta S. Intrastromal voriconazole for deep recalcitrant fungal keratitis: a case series. Br J Ophthalmol. 2015; 99(2): 195–198, doi:10.1136/bjophthalmol-2014-305412.
7. Ruggeri A, Pajaro S. Automatic recognition of cell layers in corneal confocal microscopy images. Comput Methods Programs Biomed. 2002; 68(1): 25–35, doi:10.1016/s0169-2607(01)00153-5.
8. Elbita A, Qahwaji R, Ipson S, Sharif MS, Ghanchi F. Preparation of 2D sequences of corneal images for 3D model building. Comput Methods Programs Biomed. 2014; 114(2): 194–205, doi:10.1016/j.cmpb.2014.01.009.
9. Sharif MS, Qahwaji R, Ipson S, Brahma A. Medical image classification based on artificial intelligence approaches: a practical study on normal and abnormal confocal corneal images. Applied Soft Computing. 2015; 36: 269–282, doi:10.1016/j.asoc.2015.07.019.
10. Agarwal A, Dua HS, Narang P, et al. Pre-Descemet's endothelial keratoplasty (PDEK). Br J Ophthalmol. 2014; 98(9): 1181–1185, doi:10.1136/bjophthalmol-2013-304639.
11. Levine H, Hwang J, Dermer H, et al. Relationships between activated dendritic cells and dry eye symptoms and signs. Ocul Surf. 2021; 21: 186–192, doi:10.1016/j.jtos.2021.06.001.
12. Mikołajczyk A, Grochowski M. Data augmentation for improving deep learning in image classification problem. 2018 International Interdisciplinary PhD Workshop (IIPhDW). Swinoujscie, Poland. May 9–12, 2018; 117–122, doi:10.1109/IIPHDW.2018.8388338.
13. Traore BB, Kamsu-Foguem B, Tangara F. Deep convolution neural network for image recognition. Ecological Informatics. 2018; 48: 257–268, doi:10.1016/j.ecoinf.2018.10.002.
14. Xu F, Qin Y, He W, et al. A deep transfer learning framework for the automated assessment of corneal inflammation on in vivo confocal microscopy images. PLoS One. 2021; 16(6): e0252653, doi:10.1371/journal.pone.0252653.
15. Xu F, Jiang L, He W, et al. The clinical value of explainable deep learning for diagnosing fungal keratitis using in vivo confocal microscopy images. Front Med (Lausanne). 2021; 14(8): 797616, doi:10.3389/fmed.2021.797616.
16. Sánchez JS, Mollineda RA, Sotoca JM. An analysis of how training data complexity affects the nearest neighbor classifiers. Pattern Analysis and Applications. 2007; 10(3): 189–201, doi:10.1007/s10044-007-0061-2.
17. Raniszewski M. Sequential reduction algorithm for nearest neighbor rule. International Conference on Computer Vision and Graphics. Warsaw, Poland. September 20–22, 2010: 219–226, doi:10.1007/978-3-642-15907-7_27.
18. Hüllermeier E, Vanderlooy S. Combining predictions in pairwise classification: an optimal adaptive voting strategy and its relation to weighted voting. Pattern Recogn. 2010; 43: 128–142, doi:10.1016/j.patcog.2009.06.013.
19. Kuncheva LI, Rodríguez JJ. A weighted voting framework for classifiers ensembles. Knowledge and Information Systems. 2014; 38(2): 259–275, doi:10.1007/s10115-012-0586-6.
20. Ke G, Meng Q, Finley T, et al. LightGBM: a highly efficient gradient boosting decision tree. Advances in Neural Information Processing Systems 30 (NIPS 2017). 2017; 30: 3146–3154.
21. Nakas CT, Yiannoutsos CT. Ordered multiple-class ROC analysis with continuous measurements. Stat Med. 2004; 23(22): 3437–3449, doi:10.1002/sim.1917.