Published online Jan 14, 2024. doi: 10.3748/wjg.v30.i2.170
Peer-review started: November 8, 2023
First decision: December 7, 2023
Revised: December 15, 2023
Accepted: December 26, 2023
Article in press: December 26, 2023
Published online: January 14, 2024
Processing time: 65 Days and 3.7 Hours
Deep learning provides an efficient automatic image recognition method for small bowel (SB) capsule endoscopy (CE) that can assist physicians in diagnosis. However, existing deep learning models still face unresolved challenges.
CE reading is time-consuming and complicated. Abnormal findings account for only a small proportion of CE images, so lesions are easily missed, which affects both lesion detection and the assessment of bleeding risk. Meanwhile, deep learning has made significant progress in both image classification and object detection.
To propose a novel and effective classification and detection model to automatically identify various SB lesions and their bleeding risks, and label the lesions accurately, so as to enhance the diagnostic efficiency of physicians and their ability to identify high-risk bleeding groups.
The proposed model was a two-stage method that combined image classification with object detection. First, an improved ResNet-50 classification model classified endoscopic images into SB lesion images, normal SB mucosa images, and invalid images. Then, an improved YOLO-V5 detection model detected the type of lesion and its bleeding risk and marked the location of the lesion. We constructed training and testing sets and compared model-assisted readings with physician readings.
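The two-stage flow described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function `read_capsule_frames`, the `Detection` and `FrameResult` containers, and the label strings are hypothetical names, and the two models are passed in as plain callables standing in for the improved ResNet-50 classifier and the improved YOLO-V5 detector.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, List, Sequence, Tuple

# Stage-1 classes come from the text; the string spellings are assumptions.
LESION, NORMAL, INVALID = "lesion", "normal", "invalid"


@dataclass
class Detection:
    """One stage-2 finding (hypothetical container)."""
    lesion_type: str                   # e.g. "ulcer"
    bleeding_risk: str                 # risk category assigned by the detector
    box: Tuple[int, int, int, int]     # (x_min, y_min, x_max, y_max) in pixels


@dataclass
class FrameResult:
    """Combined output of both stages for one frame."""
    frame_index: int
    stage1_label: str
    detections: List[Detection] = field(default_factory=list)


def read_capsule_frames(
    frames: Sequence[Any],
    classify: Callable[[Any], str],
    detect: Callable[[Any], List[Detection]],
) -> List[FrameResult]:
    """Run stage 1 on every frame and stage 2 only on lesion frames."""
    results = []
    for index, frame in enumerate(frames):
        label = classify(frame)
        if label not in (LESION, NORMAL, INVALID):
            raise ValueError(f"unexpected stage-1 label: {label!r}")
        # Normal and invalid frames never reach the detector.
        detections = detect(frame) if label == LESION else []
        results.append(FrameResult(index, label, detections))
    return results
```

In this arrangement the detector only sees frames that the classifier flagged as lesion images, which is what separates a two-stage design from running a single detection module over every frame.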
The accuracy of the model constructed in this study reached 98.96%, which was higher than the accuracy of other systems using only a single module. The sensitivity, specificity, and accuracy of model-assisted reading across all images were 99.17%, 99.92%, and 99.86%, respectively, which were significantly higher than those of the endoscopists’ diagnoses. The image processing time of the model was 48 ms/image, whereas that of the physicians was 0.40 ± 0.24 s/image (P < 0.001).
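For readers less familiar with these metrics, the sketch below shows how sensitivity, specificity, and accuracy are conventionally derived from per-image confusion counts. The function name `diagnostic_metrics` is hypothetical, and the counts in the usage lines are illustrative placeholders, not data from this study.

```python
def diagnostic_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Standard definitions from true/false positive and negative counts."""
    positives = tp + fn   # images that truly contain a lesion
    negatives = tn + fp   # images that truly do not
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative image")
    return {
        "sensitivity": tp / positives,                    # lesions correctly flagged
        "specificity": tn / negatives,                    # non-lesions correctly cleared
        "accuracy": (tp + tn) / (positives + negatives),  # all correct calls
    }


# Illustrative counts only (assumed, not taken from the paper).
metrics = diagnostic_metrics(tp=90, fp=20, tn=880, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```

Because abnormal frames make up only a small share of CE images, accuracy is dominated by the negative class, so sensitivity and specificity are worth reading alongside it.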
The deep learning model combining image classification with object detection achieves satisfactory diagnostic performance for a variety of SB lesions and their bleeding risks in CE images, enhancing the diagnostic efficiency of physicians and improving their ability to identify high-risk bleeding groups.
We utilized a two-stage combination method and added multiple modules to identify normal SB mucosa images, invalid images, and various SB lesions (lymphangiectasia, lymphoid follicular hyperplasia, xanthoma, erosion, ulcer smaller than 2 cm, protruding lesion smaller than 1 cm, ulcer larger than 2 cm, protruding lesion larger than 1 cm, vascular lesions, and blood). The bleeding risk was evaluated and classified.