Computing & Computer Technologies

Multi-Label Image Classification Model Based on Multiscale Fusion and Adaptive Label Correlation

  • School of Computer Information Engineering, Jiangxi Normal University, Nanchang 330022, China

Received date: 2023-07-10

Accepted date: 2023-07-31

Online published: 2023-12-21

Abstract

Current research on multi-label image classification focuses mainly on exploiting correlations between labels to improve classification accuracy. However, existing methods compute label correlation from the statistical information of the data. Such correlation is global and dataset-dependent, so it does not suit every sample. In addition, the feature information of small objects is easily lost during image feature extraction, which lowers the classification accuracy for small objects. To address these problems, this paper proposes a multi-label image classification model based on multiscale fusion and adaptive label correlation. First, feature maps at multiple scales are fused to enhance the feature information of small objects. Next, semantic guidance decomposes the fused feature map into one feature vector per category. The self-attention mechanism of a graph attention network then adaptively mines the correlations between the categories present in the image, yielding feature vectors that contain category-related information for the final classification. The model reaches a mean average precision of 95.6% on VOC 2007 and 83.6% on MS COCO 2014, two public datasets, and outperforms the latest existing methods on most indicators.
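The three stages named in the abstract (multiscale fusion, semantic decomposition, graph attention over categories) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the additive nearest-neighbour fusion, the dot-product form of the semantic guidance, the single attention head, the fully connected category graph, and all tensor sizes are assumptions chosen for illustration. Only the attention formula follows the standard graph attention layer, in which a score e_ij = LeakyReLU(a^T [W h_i || W h_j]) is normalised with a softmax over j.

```python
import numpy as np

def softmax(x, axis):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def fuse_scales(fine, coarse):
    # Assumed fusion: nearest-neighbour upsampling of the coarse map, then addition.
    # Both maps are assumed to share the same channel count D.
    factor = fine.shape[1] // coarse.shape[1]
    up = coarse.repeat(factor, axis=1).repeat(factor, axis=2)
    return fine + up

def decompose_by_category(fmap, label_emb):
    # Assumed semantic guidance: each category embedding attends over all
    # spatial positions and pools them into one vector per category.
    # fmap: (D, H, W), label_emb: (C, D) -> returns (C, D)
    d, h, w = fmap.shape
    pixels = fmap.reshape(d, h * w).T            # (H*W, D)
    weights = softmax(label_emb @ pixels.T, 1)   # (C, H*W), rows sum to 1
    return weights @ pixels

def graph_attention(h, W, a, slope=0.2):
    # Single-head graph attention layer on a fully connected category graph.
    # h: (C, D), W: (D, F), a: (2F,) -> returns (C, F) features and (C, C) weights
    z = h @ W
    f = z.shape[1]
    # a^T [z_i || z_j] splits into a source term plus a destination term.
    e = (z @ a[:f])[:, None] + (z @ a[f:])[None, :]
    e = np.where(e > 0, e, slope * e)            # LeakyReLU
    alpha = softmax(e, axis=1)                   # per-image label correlation
    return alpha @ z, alpha

# Toy example with random tensors; C = 20 matches the VOC 2007 category count.
rng = np.random.default_rng(0)
C, D, F = 20, 16, 8
fine = rng.normal(size=(D, 8, 8))      # hypothetical high-resolution feature map
coarse = rng.normal(size=(D, 4, 4))    # hypothetical low-resolution feature map
label_emb = rng.normal(size=(C, D))    # hypothetical category embeddings

fused = fuse_scales(fine, coarse)
cat_vecs = decompose_by_category(fused, label_emb)
out, alpha = graph_attention(cat_vecs, rng.normal(size=(D, F)), rng.normal(size=2 * F))

# Hypothetical shared linear classifier with a sigmoid: one score per category.
scores = 1.0 / (1.0 + np.exp(-(out @ rng.normal(size=F))))
print(fused.shape, cat_vecs.shape, out.shape, alpha.shape, scores.shape)
```

Because `alpha` is recomputed from the category vectors of each image, the correlation it encodes varies from sample to sample, in contrast to a co-occurrence matrix computed once from dataset statistics.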

Cite this article

YE Jihua, JIANG Lu, XIAO Shunjie, ZONG Yi, JIANG Aiwen. Multi-Label Image Classification Model Based on Multiscale Fusion and Adaptive Label Correlation[J]. Journal of Shanghai Jiaotong University (Science), 2025, 30(5): 889-898. DOI: 10.1007/s12204-023-2688-6

