J Shanghai Jiaotong Univ Sci, 2023, Vol. 28, Issue (3): 360-369. doi: 10.1007/s12204-023-2603-1
JIANG Rui1* (姜锐), ZHU Ruixiang1 (朱瑞祥), CAI Xiaocui1 (蔡萧萃), SU Hu2 (苏虎)
Received:
2022-12-27
Revised:
2023-01-11
Accepted:
2023-05-28
Online:
2023-05-28
Published:
2023-05-22
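Abstract: Moving object segmentation (MOS) is one of the essential functions of the vision system of all robots, including medical robots. Deep learning-based MOS methods, especially deep end-to-end MOS methods, are being actively investigated in this field. Foreground segmentation networks (FgSegNets) are representative deep end-to-end MOS methods proposed recently. This study explores a new mechanism that improves the spatial feature learning capability of FgSegNets while introducing relatively few parameters. Specifically, we propose an enhanced attention (EA) module, which is a parallel connection of an attention module and a lightweight enhancement module, with sequential attention and residual attention as its special cases. We also propose integrating EA with FgSegNet_v2 by adopting the lightweight convolutional block attention module as the attention module and inserting EA after the two max-pooling layers of the encoder. The derived new model is named FgSegNet_v2_EA. An ablation study verifies the effectiveness of the proposed EA module and the integration strategy. Experimental results on the CDnet2014 dataset show that FgSegNet_v2_EA outperforms FgSegNet_v2 by 0.08% and 14.5% under the scene-dependent and scene-independent evaluation settings, respectively, which indicates that EA has a positive effect on the spatial feature learning capability of FgSegNet_v2.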
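The parallel structure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that the two branches are merged by element-wise addition, which the abstract does not state but which is consistent with its claim about special cases: a zero enhancement branch gives sequential attention, and an identity enhancement branch gives residual attention. The toy `channel_attention` function is a hypothetical stand-in for the convolutional block attention module, and all function names here are illustrative.

```python
import math
from typing import Callable, List

# A feature map is stored as nested lists: [channel][row][column].
FeatureMap = List[List[List[float]]]
Branch = Callable[[FeatureMap], FeatureMap]


def channel_attention(x: FeatureMap) -> FeatureMap:
    """Toy attention (assumption, not CBAM): scale each channel by sigmoid(mean)."""
    out = []
    for channel in x:
        values = [v for row in channel for v in row]
        mean = sum(values) / len(values)
        gate = 1.0 / (1.0 + math.exp(-mean))
        out.append([[gate * v for v in row] for row in channel])
    return out


def zero_branch(x: FeatureMap) -> FeatureMap:
    """Enhancement branch that contributes nothing."""
    return [[[0.0 for _ in row] for row in channel] for channel in x]


def identity_branch(x: FeatureMap) -> FeatureMap:
    """Enhancement branch that passes the input through unchanged."""
    return [[list(row) for row in channel] for channel in x]


def enhanced_attention(x: FeatureMap, attention: Branch, enhancement: Branch) -> FeatureMap:
    """Parallel connection: both branches read the same input.

    The element-wise sum used to merge them is an assumption of this sketch.
    """
    attended = attention(x)
    enhanced = enhancement(x)
    return [
        [[a + e for a, e in zip(a_row, e_row)] for a_row, e_row in zip(a_ch, e_ch)]
        for a_ch, e_ch in zip(attended, enhanced)
    ]


# Two channels, each a 1x2 map.
x = [[[1.0, 3.0]], [[0.0, 0.0]]]

# Zero enhancement reduces to sequential attention: output = attention(x).
sequential = enhanced_attention(x, channel_attention, zero_branch)

# Identity enhancement reduces to residual attention: output = x + attention(x).
residual = enhanced_attention(x, channel_attention, identity_branch)
```

In the model described above, the enhancement branch would be a lightweight learned module rather than either of these fixed functions, and the module would sit after the two max-pooling layers of the FgSegNet_v2 encoder; neither detail is reproduced in this sketch.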
JIANG Rui1* (姜锐), ZHU Ruixiang1 (朱瑞祥), CAI Xiaocui1 (蔡萧萃), SU Hu2 (苏虎). Foreground Segmentation Network with Enhanced Attention [J]. J Shanghai Jiaotong Univ Sci, 2023, 28(3): 360-369.