MAGPNet: Multi-Domain Attention-Guided Pyramid Network for Infrared Small Object Detection

doi:10.1007/s12204-024-2694-3

Abstract

To overcome insufficient feature extraction and the lack of prior information for infrared dim small targets, we propose a multi-domain attention-guided pyramid network (MAGPNet). Specifically, we design three modules to ensure that the salient features of small targets are acquired and retained in multi-scale feature maps. To improve the network's adaptability to targets of different sizes, we design a kernel aggregation attention (KAA) block with a receptive field attention branch, which uses an attention mechanism to weight the feature maps obtained under different receptive fields. Drawing on research on the human visual system, we further propose an adaptive local contrast measure (ALCM) module to enhance the local features of infrared small targets. This parameterized component aggregates information from multi-scale contrast saliency maps. Finally, to fully exploit the spatial-domain and channel-domain information in feature maps of different scales, we propose a mixed spatial-channel (MSC) attention-guided fusion module, which achieves high-quality fusion while ensuring that small-target features are preserved in deep layers. Experiments on public datasets show that MAGPNet outperforms other state-of-the-art methods in terms of intersection over union (IoU), precision, recall, and F-measure. In addition, we conduct detailed ablation studies to verify the effectiveness of each component of the network.
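The four evaluation metrics named in the abstract can be illustrated with a minimal pixel-level sketch. It assumes binary prediction and ground-truth masks given as nested lists of 0/1 values; the function name `segmentation_metrics`, the zero-denominator convention, and the toy masks are hypothetical choices for illustration, and the paper's actual evaluation protocol (thresholding, per-image versus dataset-level averaging) is not specified here.

```python
def segmentation_metrics(pred, target):
    """Pixel-level IoU, precision, recall, and F-measure for binary masks.

    pred and target are equally sized 2-D lists of 0/1 values
    (an assumed input format, not the paper's).
    """
    tp = fp = fn = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                tp += 1      # target pixel correctly detected
            elif p and not t:
                fp += 1      # background pixel flagged as target
            elif t and not p:
                fn += 1      # target pixel missed

    def safe_div(num, den):
        # Assumed convention: report 0.0 when the denominator is zero.
        return num / den if den else 0.0

    precision = safe_div(tp, tp + fp)
    recall = safe_div(tp, tp + fn)
    iou = safe_div(tp, tp + fp + fn)
    f_measure = safe_div(2 * precision * recall, precision + recall)
    return {
        "iou": iou,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
    }


# Toy 3x3 masks: 2 true positives, 1 false positive, 1 false negative.
pred = [[0, 1, 1], [0, 1, 0], [0, 0, 0]]
target = [[0, 1, 0], [0, 1, 1], [0, 0, 0]]
metrics = segmentation_metrics(pred, target)
```

For these toy masks, IoU is 2/4 and precision and recall are both 2/3. Because small targets occupy only a few pixels, a single misclassified pixel shifts IoU noticeably, which is one reason all four metrics are usually reported together.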

CLC number: TP391.4

DING Leqi, WANG Biyun, YAO Lixiu, CAI Yunze. MAGPNet: Multi-Domain Attention-Guided Pyramid Network for Infrared Small Object Detection[J]. J Shanghai Jiaotong Univ Sci, 2025, 30(5): 935-951.
