J Shanghai Jiaotong Univ Sci, 2025, Vol. 30, Issue (5): 889-898. doi: 10.1007/s12204-023-2688-6
Received: 2023-07-10
Accepted: 2023-07-31
Published: 2025-09-26
Released online: 2023-12-21
YE Jihua (叶继华), JIANG Lu (江蕗), XIAO Shunjie (肖顺杰), ZONG Yi (宗义), JIANG Aiwen (江爱文)
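Abstract: Current research on multi-label image classification focuses mainly on exploiting correlations between labels to improve classification accuracy. In existing methods, however, label correlation is computed from dataset statistics; it is therefore global and dataset-dependent, and it does not suit every sample. In addition, feature information about small objects is easily lost during image feature extraction, which lowers classification accuracy for small objects. To address both problems, a multi-label image classification model based on multiscale fusion and adaptive label correlation is proposed. First, feature maps at multiple scales are fused to enhance the feature information of small objects, and the fused feature map is decomposed, under the guidance of label semantics, into one feature vector per category. Then the self-attention mechanism in a graph attention module adaptively mines the correlations among the categories present in the image, and an attention regularization loss is proposed. The model reaches a mean average precision (mAP) of 95.6% on VOC 2007 and 83.6% on MS COCO 2014, two public datasets, and outperforms existing state-of-the-art methods on most metrics.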
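The three steps named in the abstract (multiscale fusion, label-guided decomposition, and per-image label correlation through self-attention) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not specify the fusion operator, the decomposition, or the graph attention module, so upsample-and-add fusion, attention pooling against label embeddings, and plain scaled dot-product self-attention are used here as stand-ins. All function names, shapes, and the random inputs are hypothetical, and the attention regularization loss is omitted because the abstract does not define it.

```python
import numpy as np

def upsample_nearest(x, factor):
    """Nearest-neighbour upsampling of a (C, H, W) feature map."""
    return x.repeat(factor, axis=1).repeat(factor, axis=2)

def softmax(z, axis=-1):
    """Numerically stable softmax along the given axis."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def fuse_scales(fine, coarse):
    """Assumed fusion: add the 2x-upsampled coarse map to the fine map.
    fine: (C, H, W); coarse: (C, H/2, W/2)."""
    return fine + upsample_nearest(coarse, 2)

def decompose_by_labels(fmap, label_emb):
    """Assumed label-guided decomposition via attention pooling.
    fmap: (C, H, W); label_emb: (K, C). Returns one vector per category, (K, C)."""
    C, H, W = fmap.shape
    flat = fmap.reshape(C, H * W)              # (C, N) with N = H * W positions
    scores = label_emb @ flat / np.sqrt(C)     # (K, N) label-to-position affinity
    attn = softmax(scores, axis=1)             # each label attends over positions
    return attn @ flat.T                       # (K, C)

def label_self_attention(cat_feats):
    """Stand-in for the graph attention module: scaled dot-product
    self-attention over the K category vectors of a single image.
    Returns the updated features (K, C) and the (K, K) correlation matrix."""
    K, C = cat_feats.shape
    A = softmax(cat_feats @ cat_feats.T / np.sqrt(C), axis=1)
    return A @ cat_feats, A

# Toy inputs (random placeholders, not real backbone features or word embeddings).
rng = np.random.default_rng(0)
C, K = 8, 5
fine = rng.standard_normal((C, 4, 4))
coarse = rng.standard_normal((C, 2, 2))
label_emb = rng.standard_normal((K, C))

fused = fuse_scales(fine, coarse)
cat_feats = decompose_by_labels(fused, label_emb)
updated, A = label_self_attention(cat_feats)
```

Because the matrix `A` is computed from the category vectors of one image, it changes from sample to sample, in contrast to a co-occurrence matrix estimated once from dataset statistics. This is the distinction the abstract draws between adaptive and global label correlation.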
YE Jihua, JIANG Lu, XIAO Shunjie, ZONG Yi, JIANG Aiwen. Multi-Label Image Classification Model Based on Multiscale Fusion and Adaptive Label Correlation[J]. J Shanghai Jiaotong Univ Sci, 2025, 30(5): 889-898.