J Shanghai Jiaotong Univ Sci, 2025, Vol. 30, Issue (3): 521-534. DOI: 10.1007/s12204-023-2671-2
Si Bingqi1,2, Pang Chenxi3, Wang Zhiwu1,2, Jiang Pingping1,2, Yan Guozheng1,2
Received: 2022-11-14
Accepted: 2023-02-27
Online: 2025-06-06
Published: 2025-06-06
Abstract: Colorectal cancer is one of the most common cancers and the second leading cause of cancer-related death. Polypoid lesions are a precursor of colorectal cancer, and the detection and removal of polyps can effectively reduce early-stage patient mortality. However, endoscopic examinations produce a large number of images, which greatly increases physicians' workload, and prolonged, mechanical screening of endoscope images also leads to a high misdiagnosis rate. To address the heavy dependence of computer-aided diagnosis models on computing power in polyp detection tasks, we propose a lightweight model based on the YOLOv5 algorithm, coordinate attention-YOLOv5-Lite-Prune. It differs from the state-of-the-art methods in existing studies, such as the faster region-based convolutional neural network, YOLOv3, YOLOv4 and the single shot multibox detector, which apply object detection models or their variants directly to the prediction task without any lightweight processing. The innovations of our model are as follows. First, the lightweight EfficientNet-Lite is introduced as the new feature extraction network. Second, depthwise separable convolution and its improved modules, combined with different attention mechanisms, replace the standard convolutions in the detection head. Third, the α-IoU loss function is used to improve the model's accuracy and convergence speed. Finally, a pruning algorithm is used to compress the model size. Our model effectively reduces the number of parameters and the computational complexity without a significant loss of accuracy. As a result, it can be deployed on an embedded deep-learning platform and detect polyps at more than 30 frames per second, freeing the model from the usual dependence of deep learning on high-performance servers.
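The abstract names the main building blocks without giving their implementation. As a minimal sketch (not the authors' released code), the following PyTorch-style snippet illustrates how a depthwise separable convolution followed by coordinate attention could replace a standard convolution in a YOLOv5-style detection head, and how the α-IoU loss generalizes the plain IoU loss; the class names, reduction ratio, and α = 3 are illustrative assumptions.

import torch
import torch.nn as nn

class CoordinateAttention(nn.Module):
    # Coordinate attention (Hou et al., CVPR 2021): global pooling is factorized
    # into two direction-aware poolings along the height and width axes.
    def __init__(self, channels, reduction=32):
        super().__init__()
        mid = max(8, channels // reduction)
        self.pool_h = nn.AdaptiveAvgPool2d((None, 1))   # (N, C, H, 1)
        self.pool_w = nn.AdaptiveAvgPool2d((1, None))   # (N, C, 1, W)
        self.conv1 = nn.Conv2d(channels, mid, 1, bias=False)
        self.bn1 = nn.BatchNorm2d(mid)
        self.act = nn.Hardswish()
        self.conv_h = nn.Conv2d(mid, channels, 1)
        self.conv_w = nn.Conv2d(mid, channels, 1)

    def forward(self, x):
        _, _, h, w = x.shape
        xh = self.pool_h(x)                              # (N, C, H, 1)
        xw = self.pool_w(x).permute(0, 1, 3, 2)          # (N, C, W, 1)
        y = self.act(self.bn1(self.conv1(torch.cat([xh, xw], dim=2))))
        yh, yw = torch.split(y, [h, w], dim=2)
        a_h = torch.sigmoid(self.conv_h(yh))                      # (N, C, H, 1)
        a_w = torch.sigmoid(self.conv_w(yw.permute(0, 1, 3, 2)))  # (N, C, 1, W)
        return x * a_h * a_w                             # direction-aware reweighting

class DWSepConvCA(nn.Module):
    # Depthwise separable convolution + coordinate attention: a lightweight
    # stand-in for a standard 3x3 convolution in the detection head.
    # Example: DWSepConvCA(256, 18), since a YOLOv5-style head outputs
    # (anchors) x (5 + classes) channels, i.e. 3 x (5 + 1) = 18 for a single polyp class.
    def __init__(self, c_in, c_out, k=3, s=1):
        super().__init__()
        self.dw = nn.Conv2d(c_in, c_in, k, s, k // 2, groups=c_in, bias=False)
        self.pw = nn.Conv2d(c_in, c_out, 1, bias=False)
        self.bn = nn.BatchNorm2d(c_out)
        self.act = nn.SiLU()
        self.ca = CoordinateAttention(c_out)

    def forward(self, x):
        return self.ca(self.act(self.bn(self.pw(self.dw(x)))))

def alpha_iou_loss(iou, alpha=3.0):
    # alpha-IoU (He et al., 2021): a power generalization of the IoU loss;
    # alpha = 1 recovers the plain IoU loss, larger alpha up-weights high-IoU boxes.
    return 1.0 - iou.clamp(min=1e-7) ** alpha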
CLC Number:
Si Bingqi, Pang Chenxi, Wang Zhiwu, Jiang Pingping, Yan Guozheng. Real-Time Lightweight Convolutional Neural Network for Polyp Detection in Endoscope Images [J]. J Shanghai Jiaotong Univ Sci, 2025, 30(3): 521-534.
Related Articles
[1] Xue Ang, Jiang Enyu, Zhang Wentao, Lin Shunfu, Mi Yang. Foreign object detection in transmission line corridors based on the fusion of a window self-attention network and YOLOv5 [J]. Journal of Shanghai Jiao Tong University, 2025, 59(3): 413-423.
[2] Wang Yili, Li Qiang, Shen Junyi, Yang Yidong, Wang Qi. Multimodal ship image generation and key-part detection [J]. Air & Space Defense, 2025, 8(1): 77-85.
[3] Li Chuchen, Tang Shanjun, Zhao Bingqing. A dim and small target detection algorithm based on block information of UAV detection images [J]. Air & Space Defense, 2025, 8(1): 41-47.
[4] Chen Xue, Zhu Longyu, Xue Qinyang, Wang Kexiang, Han Zhilin, Luo Chuyang. Research progress on resin-matrix composites for air defense missiles [J]. Air & Space Defense, 2024, 7(6): 76-95.
[5] Liu Jing, Guo Xiaolei, Zhang Xinhai, Mao Jingjun, Lyu Ruiheng. A lightweight oriented-box aerial target detection algorithm for air-to-surface missiles [J]. Air & Space Defense, 2024, 7(4): 106-113.
[6] Zhang Yanjun, Wang Biyun, Cai Yunze. Attention-based multi-channel network for infrared dim and small target detection [J]. J Shanghai Jiaotong Univ Sci, 2024, 29(3): 414-427.
[7] Xu Bin, Hao Yuchen, Yang Haiyang, Liu Guang, Xu Quan, Li Yulin, Lin Nan, Hua Zhou. Development and prospects of structural lightweight technologies for air defense missiles [J]. Air & Space Defense, 2024, 7(3): 1-13.
[8] Zhang Tao, Zhang Xuerui, Chen Yong, Zhong Kelin, Luo Qijun. Multi-scale target detection on the airport surface for visual navigation of civil aircraft [J]. Journal of Shanghai Jiao Tong University, 2024, 58(11): 1816-1825.
[9] Niu Guochen, Sun Xiangyu, Yuan Zhengyan. Vehicle-road cooperative perception method based on dual-stream feature extraction [J]. Journal of Shanghai Jiao Tong University, 2024, 58(11): 1826-1834.
[10] Wang Jianyuan, Chen Xiaotong, Zhang Yue, Sun Junge, Shi Donghao, Chen Jinbao. Urban target detection algorithm with multimodal fusion for UAVs [J]. Air & Space Defense, 2024, 7(1): 32-39.
[11] Zhang Xiaoyu, Du Xiangrun, Zhang Jialiang, Tan Panlong, Yang Shibo. Research on infrared image target detection based on Deformable DETR [J]. Air & Space Defense, 2024, 7(1): 16-23.
[12] Wu Xing, Zhang Qingfeng, Wang Jianjia, Yao Junfeng, Guo Yike. Printed circuit board defect detection based on a multi-detection-model fusion framework [J]. J Shanghai Jiaotong Univ Sci, 2023, 28(6): 717-727.
[13] Ding Xiaohong, Zhang Heng, Shen Hong. Research progress on structural optimization and additive manufacturing of high-speed vehicles [J]. Air & Space Defense, 2023, 6(2): 1-11.
[14] Spatio-temporal correlation-based 3D vehicle detection and tracking system using multi-view road cameras [J]. J Shanghai Jiaotong Univ Sci, 2023, 28(1): 52-60.
[15] Yang Lingyu, Dong Shun, Hong Changqing. Preparation and properties of a novel low-density carbon fiber composite modified with resin-derived pyrolytic carbon [J]. Air & Space Defense, 2022, 5(4): 1-9.