UConvTrans: A Dual-Flow Cardiac Image Segmentation Network by Global and Local Information Integration

doi: 10.16183/j.cnki.jsjtu.2022.088

Journal of Shanghai Jiao Tong University ›› 2023, Vol. 57 ›› Issue (5): 570-581.

Special topic: Journal of Shanghai Jiao Tong University, 2023 "Biomedical Engineering" special topic

LI Qing¹,², HUANGFU Yubin¹, LI Jiangyun¹,², YANG Zhifang¹, CHEN Peng³, WANG Zihan¹

1. School of Automation and Electrical Engineering, University of Science and Technology Beijing, Beijing 100083, China
2. Key Laboratory of Knowledge Automation for Industrial Processes of the Ministry of Education, University of Science and Technology Beijing, Beijing 100083, China
3. FINTECH Innovation Division, Postal Savings Bank of China, Beijing 100808, China

Received: 2022-03-31 Revised: 2022-05-19 Accepted: 2022-05-24 Online: 2023-05-28 Published: 2023-06-02
Contact: LI Jiangyun E-mail: leejy@ustb.edu.cn
About the author: LI Qing (born 1971), professor; research interests include intelligent control, intelligent optimization, and image processing.
Funding: National Natural Science Foundation of China (62173029)

Abstract:

Cardiac magnetic resonance imaging (MRI) is noisy, its target regions closely resemble the background, and the right ventricle has an irregular shape that appears crescent-like or oblate. Although U-shaped convolutional neural network (CNN) structures perform well in medical image segmentation, convolution is a local operation, so these networks have limited ability to model long-range dependencies between pixels and struggle to improve segmentation accuracy on cardiac MRI. To address these problems, UConvTrans, a dual-branch (dual-flow) U-shaped network that integrates global and local information, is proposed. First, a CNN branch extracts local features and a Transformer branch models global context, which retains local detail and suppresses interference from noise and background regions in cardiac MRI. Second, a bidirectional fusion module exchanges and fuses the features extracted by the CNN and the Transformer, which enhances feature expression and improves the segmentation accuracy of the right ventricle. Because the Transformer structure in the proposed method does not require pre-training on large-scale datasets, the network structure can be adjusted flexibly. The proposed method also balances accuracy and efficiency. On the MICCAI 2017 ACDC dataset, it outperforms U-Net by 1.13 percentage points in average Dice coefficient while using only 10% of the parameters and 8% of the floating-point operations of U-Net. On the official ACDC test set, it achieves Dice coefficients of 92.42% for the right ventricle, 91.64% for the myocardium, and 95.06% for the left ventricle, the best results reported to date for the myocardium and the left ventricle.

Key words: medical image segmentation, cardiac magnetic resonance imaging (MRI), convolutional neural network (CNN), Transformer model, encoder-decoder
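The Dice coefficient used in the abstract measures the overlap between a predicted mask P and a ground-truth mask G as 2|P∩G| / (|P| + |G|). The following is a minimal sketch of a per-class computation on toy label maps, not the authors' evaluation code. The function name `dice_per_class`, the label ids (1 = right ventricle, 2 = myocardium, 3 = left ventricle), and the handling of empty classes are assumptions made for illustration.

```python
from typing import Dict, Sequence


def dice_per_class(pred: Sequence[Sequence[int]],
                   truth: Sequence[Sequence[int]],
                   labels: Sequence[int]) -> Dict[int, float]:
    """Return Dice = 2*|P ∩ G| / (|P| + |G|) for each label id."""
    scores: Dict[int, float] = {}
    for label in labels:
        inter = pred_count = truth_count = 0
        for pred_row, truth_row in zip(pred, truth):
            for p, g in zip(pred_row, truth_row):
                pred_count += p == label
                truth_count += g == label
                inter += (p == label) and (g == label)
        denom = pred_count + truth_count
        # Assumed convention: a label absent from both masks scores 1.0.
        scores[label] = 2.0 * inter / denom if denom else 1.0
    return scores


# Toy 3x3 label maps (hypothetical data): 0 = background,
# 1 = RV, 2 = Myo, 3 = LV. The prediction misses one RV pixel.
truth = [[0, 1, 1],
         [2, 2, 3],
         [0, 3, 3]]
pred = [[0, 1, 0],
        [2, 2, 3],
        [0, 3, 3]]

scores = dice_per_class(pred, truth, labels=[1, 2, 3])
for label, score in scores.items():
    print(f"label {label}: Dice = {score:.4f}")
```

In this toy case the missed right-ventricle pixel lowers that class to 2/3 while the other two classes stay at 1.0, which shows why small or thin structures are sensitive to a few misclassified pixels.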

CLC number: TP391

LI Qing, HUANGFU Yubin, LI Jiangyun, YANG Zhifang, CHEN Peng, WANG Zihan. UConvTrans: A Dual-Flow Cardiac Image Segmentation Network by Global and Local Information Integration[J]. Journal of Shanghai Jiao Tong University, 2023, 57(5): 570-581.


References (29)

[1]	CHEN C, QIN C, QIU H, et al. Deep learning for cardiac image segmentation: A review[J]. Frontiers in Cardiovascular Medicine, 2020, 7: 25. doi: 10.3389/fcvm.2020.00025 pmid: 32195270
[2]	LIU Chang, LIN Nan, CAO Yangjie, et al. Seg-CapNet: Neural network model for the cardiac MRI segmentation[J]. Journal of Image and Graphics, 2021, 26(2): 452-463. (in Chinese)
[3]	LI Jiangyun, ZHAO Yikai, XUE Zhuoer, et al. A survey of model compression for deep neural networks[J]. Chinese Journal of Engineering, 2019, 41(10): 1229-1239. (in Chinese)
[4]	TIAN Juanxiu, LIU Guocai, GU Shanshan, et al. Deep learning in medical image analysis and its challenges[J]. Acta Automatica Sinica, 2018, 44(3): 401-424. (in Chinese)
[5]	ZHANG Yungang, YANG Jianfeng, YI Benshun. Improved residual encoder-decoder network for low-dose CT image denoising[J]. Journal of Shanghai Jiao Tong University, 2019, 53(8): 983-989. (in Chinese)
[6]	ZHENG Dezhong, YANG Yuanyuan, HUANG Haozhe, et al. Multimodal fusion classification network based on distance confidence score[J]. Journal of Shanghai Jiao Tong University, 2022, 56(1): 89-100. (in Chinese)
[7]	RONNEBERGER O, FISCHER P, BROX T. U-Net: Convolutional networks for biomedical image segmentation[C]//Medical Image Computing and Computer-Assisted Intervention-MICCAI 2015. Munich, Germany: Springer, 2015: 234-241.
[8]	LI J C, YU Z L, GU Z H, et al. Dilated-inception net: Multi-scale feature aggregation for cardiac right ventricle segmentation[J]. IEEE Transactions on Biomedical Engineering, 2019, 66(12): 3499-3508. doi: 10.1109/TBME.10
[9]	CHENG F, CHEN C, WANG Y, et al. Learning directional feature maps for cardiac MRI segmentation[C]//International Conference on Medical Image Computing and Computer-Assisted Intervention. Lima, Peru: Springer, 2020: 108-117.
[10]	LUO Kaikai, WANG Ting, YE Fangfang. U-Net segmentation model of brain tumor MR image based on attention mechanism and multi-view fusion[J]. Journal of Image and Graphics, 2021, 26(9): 2208-2218. (in Chinese)
[11]	WANG Ruihao, LIU Zhe, SONG Yuqing. Multi-stage pancreas localization and segmentation combined with slices context information[J]. Acta Electronica Sinica, 2021, 49(4): 706-715. doi: 10.12263/DZXB.20200101 (in Chinese)
[12]	YU H, ZHA S, HUANGFU Y B, et al. Dual attention U-Net for multi-sequence cardiac MR images segmentation[C]//Myocardial Pathology Segmentation Combining Multi-Sequence CMR Challenge. Lima, Peru: Springer, 2020: 118-127.
[13]	WANG X, GIRSHICK R, GUPTA A, et al. Non-local neural networks[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Salt Lake City, UT, USA: IEEE, 2018: 7794-7803.
[14]	VASWANI A, SHAZEER N, PARMAR N, et al. Attention is all you need[C]//Advances in Neural Information Processing Systems. Long Beach, CA, USA: MIT, 2017: 5998.
[15]	DOSOVITSKIY A, BEYER L, KOLESNIKOV A, et al. An image is worth 16×16 words: Transformers for image recognition at scale[C]//International Conference on Learning Representations. Vienna: Springer, 2021: 1-21.
[16]	ZHENG S X, LU J C, ZHAO H S, et al. Rethinking semantic segmentation from a sequence-to-sequence perspective with transformers[C]//2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Nashville, TN, USA: IEEE, 2021: 6881-6890.
[17]	CHEN J, LU Y, YU Q, et al. TransUNet: Transformers make strong encoders for medical image segmentation[EB/OL]. (2021-02-08) [2021-12-20]. https://arxiv.org/abs/2102.04306.
[18]	LI Yaoqian, LI Caizi, LIU Ruiqiang, et al. Semi-supervised spatiotemporal Transformer networks for semantic segmentation of surgical instrument[J]. Journal of Software, 2021, 33(4): 1501-1515. (in Chinese)
[19]	CAO H, WANG Y, CHEN J, et al. Swin-Unet: Unet-like pure Transformer for medical image segmentation[EB/OL]. (2021-05-12) [2021-12-20]. https://arxiv.org/abs/2105.05537.
[20]	LIU Z, LIN Y, CAO Y, et al. Swin Transformer: Hierarchical vision Transformer using shifted windows[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision. Virtual, Online: IEEE, 2021: 10012-10022.
[21]	BERNARD O, LALANDE A, ZOTTI C, et al. Deep learning techniques for automatic MRI cardiac multi-structures segmentation and diagnosis: Is the problem solved?[J]. IEEE Transactions on Medical Imaging, 2018, 37(11): 2514-2525. doi: 10.1109/TMI.2018.2837502 pmid: 29994302
[22]	HE K, ZHANG X, REN S, et al. Deep residual learning for image recognition[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas, NV, USA: IEEE, 2016: 770-778.
[23]	BAUMGARTNER C F, KOCH L M, POLLEFEYS M, et al. An exploration of 2D and 3D deep learning techniques for cardiac MR image segmentation[C]//International Workshop on Statistical Atlases and Computational Models of the Heart. Quebec City, QC, Canada: Springer, 2017: 111-119.
[24]	KHENED M, KOLLERATHU V A, KRISHNAMURTHI G. Fully convolutional multi-scale residual DenseNets for cardiac segmentation and automated cardiac diagnosis using ensemble of classifiers[J]. Medical Image Analysis, 2019, 51: 21-45. doi: S1361-8415(18)30848-X pmid: 30390512
[25]	OKTAY O, SCHLEMPER J, FOLGOC L L, et al. Attention U-Net: Learning where to look for the pancreas[EB/OL]. (2018-05-20) [2021-12-20]. https://arxiv.org/abs/1804.03999.
[26]	ISENSEE F, JAEGER P F, FULL P M, et al. Automatic cardiac disease assessment on cine-MRI via time-series segmentation and domain specific features[C]//International Workshop on Statistical Atlases and Computational Models of the Heart. Quebec City, QC, Canada: Springer, 2017: 120-129.
[27]	SIMANTIRIS G, TZIRITAS G. Cardiac MRI segmentation with a dilated CNN incorporating domain-specific constraints[J]. IEEE Journal of Selected Topics in Signal Processing, 2020, 14(6): 1235-1243. doi: 10.1109/JSTSP.4200690
[28]	GIRUM K B, CRÉHANGE G, LALANDE A. Learning with context feedback loop for robust medical image segmentation[J]. IEEE Transactions on Medical Imaging, 2021, 40(6): 1542-1554. doi: 10.1109/TMI.2021.3060497 pmid: 33606627
[29]	ZOTTI C, LUO Z, HUMBERT O, et al. GridNet with automatic shape prior registration for automatic MRI cardiac segmentation[C]//International Workshop on Statistical Atlases and Computational Models of the Heart. Quebec City, QC, Canada: Springer, 2017: 73-81.

Method	Fuse Trans to Conv	Fuse Conv to Trans	Avg DSC/%	RV DSC/%	Myo DSC/%	LV DSC/%
Only Trans	N/A	N/A	83.75	80.75	82.48	88.02
Only Conv	N/A	N/A	87.60	86.64	86.17	89.98
Trans+Conv	×	×	88.61	86.70	87.72	91.40
Trans+Conv	×	√	88.76	87.52	87.06	91.69
Trans+Conv	√	×	89.25	87.08	88.31	92.38
Trans+Conv	√	√	89.38	87.12	88.44	92.57
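To read the ablation rows above more easily, the short sketch below computes how much each fusion direction adds to the average DSC relative to the dual-branch model without fusion. The numbers are copied from the table; the variable names are hypothetical and the script is only an arithmetic aid, not part of the authors' experiments.

```python
# Average DSC (%) copied from the ablation rows above.
baseline_no_fusion = 88.61  # Trans+Conv with no fusion in either direction

variants = {
    "Conv to Trans only": 88.76,
    "Trans to Conv only": 89.25,
    "both directions": 89.38,
}

# Gain in percentage points over the unfused dual-branch baseline.
gains = {name: round(dsc - baseline_no_fusion, 2)
         for name, dsc in variants.items()}

for name, gain in gains.items():
    print(f"{name}: +{gain:.2f} points")
```

On these figures, fusing Transformer features into the CNN branch accounts for most of the improvement, and enabling both directions gives the highest average DSC.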

Channels (C)	Dimensions (D)	Avg DSC/%	RV DSC/%	Myo DSC/%	LV DSC/%	Params (×10^6)	FLOPs (×10^9)
32	32	89.38	87.12	88.44	92.57	3.65	5.03
32	64	89.60	88.08	88.30	92.41	10.59	12.74
64	32	88.97	87.49	87.81	91.60	7.39	10.81
64	64	89.30	87.80	88.17	91.92	14.54	18.79

Method	Avg DSC/%	RV DSC/%	Myo DSC/%	LV DSC/%	Params (×10^6)	FLOPs (×10^9)
U-Net [7]	88.25	86.91	87.17	90.65	34.53	65.55
Attention U-Net [25]	88.52	86.78	86.93	91.84	37.88	66.62
SwinUNet [19]	89.26	86.62	88.72	92.44	27.17	6.14
TransUNet [17]	89.47	87.04	88.51	92.85	105.32	38.57
UConvTrans (C=32,D=32)	89.38	87.12	88.44	92.57	3.65	5.03
UConvTrans (C=32,D=64)	89.60	88.08	88.30	92.41	10.59	12.74
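The efficiency claim in the abstract (about 10% of the parameters and 8% of the computation of U-Net, with a 1.13-point higher average Dice) can be checked against the comparison rows above. The sketch below repeats that arithmetic for the C=32, D=32 configuration. It assumes the last two columns are parameters in millions and floating-point operations in billions; the dictionary keys are hypothetical names.

```python
# Values copied from the comparison rows above.
# Assumed units: parameters in millions, operations in billions.
unet = {"dsc": 88.25, "params_m": 34.53, "flops_g": 65.55}
uconvtrans = {"dsc": 89.38, "params_m": 3.65, "flops_g": 5.03}  # C=32, D=32

param_ratio = uconvtrans["params_m"] / unet["params_m"]
flop_ratio = uconvtrans["flops_g"] / unet["flops_g"]
dsc_gain = round(uconvtrans["dsc"] - unet["dsc"], 2)

print(f"parameters: {param_ratio:.1%} of U-Net")
print(f"operations: {flop_ratio:.1%} of U-Net")
print(f"average DSC gain: {dsc_gain:.2f} points")
```

The ratios come out near 10.6% and 7.7%, which match the rounded 10% and 8% quoted in the abstract, and the Dice difference is 1.13 points.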

Method	RV DSC/%	Myo DSC/%	LV DSC/%
Isensee et al. [26]	92.75	91.35	94.75
Simantiris et al. [27]	91.25	89.75	94.75
Girum et al. [28]	91.60	90.00	94.20
Zotti et al. [29]	90.95	89.40	93.80
UConvTrans (C=32,D=64)	92.42	91.64	95.06
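The test-set rows above support the statement that UConvTrans leads on the myocardium and the left ventricle but not on the right ventricle. As a small check, the sketch below picks the top-scoring method per structure from the values in the table. The dictionary layout and names are assumptions for illustration only.

```python
# Test-set DSC (%) copied from the rows above.
results = {
    "Isensee et al.": {"RV": 92.75, "Myo": 91.35, "LV": 94.75},
    "Simantiris et al.": {"RV": 91.25, "Myo": 89.75, "LV": 94.75},
    "Girum et al.": {"RV": 91.60, "Myo": 90.00, "LV": 94.20},
    "Zotti et al.": {"RV": 90.95, "Myo": 89.40, "LV": 93.80},
    "UConvTrans (C=32,D=64)": {"RV": 92.42, "Myo": 91.64, "LV": 95.06},
}

# For each structure, find the method with the highest DSC.
best = {
    structure: max(results, key=lambda method: results[method][structure])
    for structure in ("RV", "Myo", "LV")
}

for structure, method in best.items():
    print(f"{structure}: {method} ({results[method][structure]:.2f}%)")
```

Among the listed methods, Isensee et al. remain ahead on the right ventricle (92.75% versus 92.42%), while UConvTrans has the highest myocardium and left-ventricle scores.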
