Medicine-Engineering Interdisciplinary

Multi-Consistency Training for Semi-Supervised Medical Image Segmentation

  • Department of Instrument Science and Engineering, Shanghai Jiao Tong University, Shanghai 200240, Chi

Received date: 2023-06-19

Accepted date: 2023-10-24

Online published: 2025-07-31

Abstract

Medical image segmentation is a crucial task in clinical applications. However, labeled medical image data are often difficult to obtain, which has driven interest in semi-supervised learning (SSL), a technique that exploits a small amount of labeled data together with unlabeled data. Nonetheless, most existing SSL segmentation methods for medical images either rely on a single consistency training method or directly fine-tune SSL methods designed for natural images. In this paper, we propose a semi-supervised method called multi-consistency training (MCT) for medical image segmentation. Our approach considers consistency from two perspectives: consistency of outputs across different up-sampling methods, and consistency of outputs for the same data within the same network under various perturbations to the intermediate features. We design distinct semi-supervised loss regression methods for these two types of consistency. To support the MCT model, we also develop a dedicated decoder as the core of our neural network. We conducted experiments on polyp and dental datasets and compared MCT against other SSL methods. The results show that our approach achieves higher segmentation accuracy. Ablation studies and further discussion support the efficacy of the approach for medical image segmentation.
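The two consistency terms named in the abstract can be illustrated with a small toy sketch. This is not the MCT implementation: the abstract does not specify the loss form, the up-sampling operators, or the perturbation, so the mean-squared-error loss, the nearest/bilinear pair, the dropout perturbation, and all function names below are assumptions chosen only for illustration. A small 2D list stands in for a low-resolution decoder feature map.

```python
import random

def upsample_nearest(grid, factor):
    """Nearest-neighbour up-sampling of a 2D list of floats."""
    return [[grid[r // factor][c // factor]
             for c in range(len(grid[0]) * factor)]
            for r in range(len(grid) * factor)]

def upsample_bilinear(grid, factor):
    """Bilinear up-sampling (corner-aligned) of a 2D list of floats."""
    h, w = len(grid), len(grid[0])
    H, W = h * factor, w * factor
    out = []
    for r in range(H):
        y = r * (h - 1) / (H - 1) if H > 1 else 0.0
        y0 = int(y)
        y1 = min(y0 + 1, h - 1)
        fy = y - y0
        row = []
        for c in range(W):
            x = c * (w - 1) / (W - 1) if W > 1 else 0.0
            x0 = int(x)
            x1 = min(x0 + 1, w - 1)
            fx = x - x0
            top = grid[y0][x0] * (1 - fx) + grid[y0][x1] * fx
            bot = grid[y1][x0] * (1 - fx) + grid[y1][x1] * fx
            row.append(top * (1 - fy) + bot * fy)
        out.append(row)
    return out

def mse(a, b):
    """Mean squared error between two equally sized 2D lists."""
    n = len(a) * len(a[0])
    return sum((x - y) ** 2
               for ra, rb in zip(a, b)
               for x, y in zip(ra, rb)) / n

def feature_dropout(grid, p, rng):
    """Assumed perturbation: inverted dropout with drop probability p."""
    keep = 1.0 - p
    return [[(v / keep if rng.random() < keep else 0.0) for v in row]
            for row in grid]

def multi_consistency_loss(features, factor=2, p=0.3, seed=0):
    """Return (up-sampling consistency, perturbation consistency)."""
    rng = random.Random(seed)
    clean = upsample_bilinear(features, factor)
    # Consistency 1: same features, two different up-sampling methods.
    loss_up = mse(clean, upsample_nearest(features, factor))
    # Consistency 2: same up-sampling, perturbed intermediate features.
    perturbed = upsample_bilinear(feature_dropout(features, p, rng), factor)
    loss_pert = mse(clean, perturbed)
    return loss_up, loss_pert

if __name__ == "__main__":
    feats = [[0.0, 1.0], [1.0, 0.0]]  # hypothetical low-resolution map
    print(multi_consistency_loss(feats, factor=2, p=0.5, seed=1))
```

Neither term needs a ground-truth mask, which is why losses of this kind can be computed on unlabeled images. In a real network each term would be back-propagated through the decoder rather than evaluated on fixed lists.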

Cite this article

Wu Changxue, Zhang Wenxi, Han Jiaozhi, Wang Hongyu. Multi-Consistency Training for Semi-Supervised Medical Image Segmentation[J]. Journal of Shanghai Jiaotong University (Science), 2025, 30(4): 800-814. DOI: 10.1007/s12204-024-2733-0
