Computing & Computer Technologies

MAGPNet: Multi-Domain Attention-Guided Pyramid Network for Infrared Small Object Detection

  • 1. Department of Automation, Shanghai Jiao Tong University; Key Laboratory of System Control and Information Processing of Ministry of Education, Shanghai 200240, China; 2. National Key Laboratory of Air-based Information Perception and Fusion, Luoyang 471009, Henan, China; 3. Shanghai Institute of Aerospace Control Technology; Infrared Detection Technology R&D Center of China Aerospace Science and Technology Corporation, Shanghai 201109, China

Received date: 2023-03-21

Accepted date: 2023-07-14

Online published: 2024-01-05

Abstract

To address the difficulty of feature extraction and the scarcity of prior appearance information for dim, small infrared targets, we propose a multi-domain attention-guided pyramid network (MAGPNet). Specifically, we design three modules to ensure that salient features of small targets are acquired and retained in the multi-scale feature maps. To improve the adaptability of the network to targets of different sizes, we design a kernel aggregation attention block with a receptive field attention branch, which uses an attention mechanism to weight the feature maps obtained under different receptive fields. Drawing on research into the human visual system, we further propose an adaptive local contrast measure module to enhance the local features of infrared small targets. This parameterized component aggregates information from multi-scale contrast saliency maps. Finally, to fully utilize the spatial and channel information in feature maps of different scales, we propose a mixed spatial-channel attention-guided fusion module that achieves high-quality fusion while ensuring that small target features are preserved in deep layers. Experiments on public datasets demonstrate that MAGPNet outperforms other state-of-the-art methods in terms of intersection over union (IoU), Precision, Recall, and F-measure. In addition, we conduct detailed ablation studies to verify the effectiveness of each component of the network.
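The four evaluation metrics named in the abstract can be illustrated with a minimal sketch. It assumes pixel-level evaluation on binary masks and an F-measure that weights Precision and Recall equally; the abstract does not state the exact evaluation protocol, so the function name `segmentation_metrics`, the `eps` smoothing term, and the toy masks are all hypothetical.

```python
def segmentation_metrics(pred, target, eps=1e-9):
    """Pixel-level IoU, Precision, Recall and F-measure for binary masks.

    pred and target are equally sized 2D lists of 0/1 values.
    eps is an assumed smoothing term that avoids division by zero.
    """
    tp = fp = fn = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                tp += 1      # target pixel correctly detected
            elif p and not t:
                fp += 1      # false alarm
            elif t and not p:
                fn += 1      # missed target pixel

    iou = tp / (tp + fp + fn + eps)
    precision = tp / (tp + fp + eps)
    recall = tp / (tp + fn + eps)
    # Assumption: balanced F-measure (harmonic mean of Precision and Recall).
    f_measure = 2 * precision * recall / (precision + recall + eps)
    return {
        "iou": iou,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
    }


# Toy 4x4 masks: the prediction overlaps the target on 3 of 4 pixels.
pred = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
target = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 1],
    [0, 0, 0, 0],
]
metrics = segmentation_metrics(pred, target)
```

In this toy case there are 3 true positives, 1 false alarm and 1 missed pixel, so IoU is about 0.6 while Precision, Recall and F-measure are each about 0.75. For targets that occupy only a few pixels, a single misclassified pixel shifts all four values noticeably.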

Cite this article

DING Leqi, WANG Biyun, YAO Lixiu, CAI Yunze. MAGPNet: Multi-Domain Attention-Guided Pyramid Network for Infrared Small Object Detection[J]. Journal of Shanghai Jiaotong University(Science), 2025, 30(5): 935-951. DOI: 10.1007/s12204-024-2694-3
