J Shanghai Jiaotong Univ Sci ›› 2023, Vol. 28 ›› Issue (4): 441-.doi: 10.1007/s12204-022-2519-1

• Medicine-Engineering Interdisciplinary Research •

Improving Colonoscopy Polyp Detection Rate Using Semi-Supervised Learning

利用半监督学习提高结肠镜息肉检出率

YAO Leyu1 (姚乐宇), HE Fan1,3 (何凡), PENG Haixia2* (彭海霞), WANG Xiaofeng2 (王晓峰), ZHOU Lu2 (周璐), HUANG Xiaolin1,3* (黄晓霖)

  (1. School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University, Shanghai 200240, China; 2. Tong Ren Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai 200336, China; 3. Institute of Medical Robotics, Shanghai Jiao Tong University, Shanghai 200240, China)
  (1. 上海交通大学 电子信息与电气工程学院,上海200240;2. 上海交通大学医学院 同仁医院,上海200336;3. 上海交通大学 医疗机器人研究院,上海200240)
  • Received: 2021-04-16 Accepted: 2021-08-02 Online: 2023-07-28 Published: 2023-07-31

Abstract: Colorectal cancer is one of the biggest health threats to humans and takes thousands of lives every year. Colonoscopy is the gold standard in clinical practice for inspecting the intestinal wall and for detecting and removing polyps at an early stage, which prevents them from becoming malignant and developing into colorectal cancer. In recent years, computer-aided polyp detection systems have been widely used in colonoscopies to improve the quality of the examination and increase the polyp detection rate. Currently, the most effective computer-aided systems are built with machine learning methods. However, developing such a system requires experienced doctors to label a large number of images from colonoscopy videos, which is extremely time-consuming, laborious and expensive. One possible solution is to adopt semi-supervised learning, which can build a detection system on a dataset in which not all of the data need to be labeled. In this paper, on the basis of a state-of-the-art object detection method and semi-supervised learning techniques, we design and implement a semi-supervised colonoscopy polyp detection system comprising four main steps: (1) running standard supervised training with all labeled data; (2) running inference on unlabeled data to obtain pseudo labels; (3) applying a set of strong augmentations to both the unlabeled data and their pseudo labels; (4) combining the labeled data with the unlabeled data and their pseudo labels to retrain the detector. The semi-supervised learning system is evaluated on both a public dataset and our own private dataset, and the results demonstrate its effectiveness. In addition, its inference speed meets the requirement of real-time operation.

Key words: semi-supervised learning, colonoscopy polyp detection, medical image analysis
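
The four steps listed in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the names `Sample`, `hflip`, `pseudo_label` and `semi_supervised_fit` are hypothetical, the detector is abstracted as any `train` function that returns a `predict` function, a horizontal flip stands in for the "strong augmentation" set, and the confidence threshold used to filter pseudo labels is an assumption that the abstract does not mention.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Box = Tuple[float, float, float, float]   # (x_min, y_min, x_max, y_max) in pixels
Image = List[List[float]]                 # toy grayscale image: rows of pixel values
Predictor = Callable[[Image], List[Tuple[Box, float]]]  # image -> [(box, score)]


@dataclass
class Sample:
    image: Image
    boxes: List[Box]


def hflip(sample: Sample) -> Sample:
    """Stand-in augmentation: flip the image left-right and mirror its boxes."""
    width = len(sample.image[0])
    image = [row[::-1] for row in sample.image]
    # Mirroring swaps the roles of x_min and x_max.
    boxes = [(width - x2, y1, width - x1, y2) for (x1, y1, x2, y2) in sample.boxes]
    return Sample(image, boxes)


def pseudo_label(predict: Predictor, images: List[Image],
                 score_threshold: float) -> List[Sample]:
    """Keep confident predictions as pseudo labels (the threshold is an assumption)."""
    samples = []
    for image in images:
        kept = [box for box, score in predict(image) if score >= score_threshold]
        if kept:  # assumption: skip images without any confident detection
            samples.append(Sample(image, kept))
    return samples


def semi_supervised_fit(train: Callable[[List[Sample]], Predictor],
                        labeled: List[Sample],
                        unlabeled_images: List[Image],
                        augment: Callable[[Sample], Sample] = hflip,
                        score_threshold: float = 0.9) -> Predictor:
    # Step 1: standard supervised training on all labeled data.
    teacher = train(labeled)
    # Step 2: inference on unlabeled data to obtain pseudo labels.
    pseudo = pseudo_label(teacher, unlabeled_images, score_threshold)
    # Step 3: augment each unlabeled image together with its pseudo labels.
    augmented = [augment(sample) for sample in pseudo]
    # Step 4: retrain the detector on labeled plus pseudo-labeled data.
    return train(labeled + augmented)
```

Any detector can be plugged in through `train`, as long as the returned `predict` function yields `(box, score)` pairs. The key point of step 3 is that a geometric augmentation must transform the pseudo-label boxes with the same mapping as the pixels, which `hflip` does.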

摘要: 结直肠癌是人类最大的健康威胁之一,每年夺去数千人的生命。结肠镜检查是临床实践中检查肠壁、在早期发现并切除息肉、防止息肉恶性化和发展成为癌症的金标准。近年来,计算机辅助息肉检测系统被广泛应用于结肠镜检查中以提高结肠镜检查的质量,提高息肉的检出率。目前,最有效的计算机辅助系统是用机器学习方法进行构建的。然而,开发这样的计算机辅助检测系统需要有经验的医生从结肠镜检查视频中标记大量图像数据,这一过程极其耗时、费力且十分昂贵。一种可能的解决方案是采用半监督学习,它可以在不必标注所有数据的数据集上建立检测系统。在本文中,基于最先进的对象检测方法和半监督学习技术,我们设计并实现了一个半监督结肠镜息肉检测系统,该系统的实现包括四个主要步骤:(1)使用所有标记数据进行标准监督训练;(2) 对未标记的数据进行推理以获得伪标签;(3) 将一系列强数据增强方法应用于未标记数据和伪标记;(4) 将标记的数据、未标记的数据与其伪标签组合以重新训练检测器。我们在公开数据集和原始私有数据集上对半监督学习系统进行了评估,并证明了其有效性。此外,半监督学习系统的推理速度也可以满足实时运转的要求。

关键词: 半监督学习,结肠镜息肉检测,医学图像分析

CLC Number: