Automation & Computer Science

Person Re-Identification Based on Spatial Feature Learning and Multi-Granularity Feature Fusion

  • 1. Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, China; 2. Robotic Laboratory, South China Robotics Innovation Research Institute, Foshan 528300, Guangdong, China; 3. Foshan Zhiyouren Technology Co., Ltd., Foshan 528300, Guangdong, China; 4. Robotic Laboratory, Institute of Intelligent Manufacturing, Guangdong Academy of Sciences, Guangzhou 510070, China

Accepted date: 2022-12-23

  Online published: 2025-03-21

Abstract

Convolutional neural networks have a weak ability to explicitly learn spatial invariance, and occlusion and background interference can cause discriminative features to be lost in person re-identification tasks. To address both problems, a person re-identification method combining spatial feature learning and multi-granularity feature fusion is proposed. First, an attention spatial transformation network (A-STN) is proposed to learn spatial features and correct the misalignment of pedestrian spatial features. The network is then divided into a global branch, a local coarse-grained fusion branch, and a local fine-grained fusion branch, which extract pedestrian global features, coarse-grained fusion features, and fine-grained fusion features, respectively. The global branch enriches the global features by fusing different pooling features. The local coarse-grained fusion branch uses overlay pooling to enhance each local feature while learning the correlation between multi-granularity features. The local fine-grained fusion branch uses differential pooling to obtain differential features, which are fused with the global features to learn the relationship between pedestrian local features and pedestrian global features. Finally, the proposed method is evaluated on three public datasets: Market1501, DukeMTMC-ReID, and CUHK03. Its experimental results are better than those of the compared methods, which verifies the effectiveness of the proposed method.
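As a rough illustration of two ideas named in the abstract, the sketch below fuses two global pooling results and extracts stripe features at two granularities. It is not the authors' implementation. The abstract does not say which pooling operations the global branch fuses, how they are fused, or how overlay pooling and differential pooling are defined. The sketch therefore assumes average and max pooling fused by an element-wise sum, and plain horizontal stripes for the local features. All function names are hypothetical.

```python
from typing import List

FeatureMap = List[List[List[float]]]  # indexed as [channel][row][column]


def global_avg_pool(fmap: FeatureMap) -> List[float]:
    """Mean of every channel over all spatial positions."""
    return [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in fmap]


def global_max_pool(fmap: FeatureMap) -> List[float]:
    """Maximum of every channel over all spatial positions."""
    return [max(max(row) for row in ch) for ch in fmap]


def fused_global_feature(fmap: FeatureMap) -> List[float]:
    """Assumed fusion rule: element-wise sum of the avg- and max-pooled vectors."""
    avg = global_avg_pool(fmap)
    mx = global_max_pool(fmap)
    return [a + m for a, m in zip(avg, mx)]


def stripe_features(fmap: FeatureMap, parts: int) -> List[List[float]]:
    """Split the map into `parts` horizontal stripes and average-pool each one."""
    height = len(fmap[0])
    if height % parts != 0:
        raise ValueError("height must be divisible by parts in this sketch")
    step = height // parts
    stripes = []
    for p in range(parts):
        # Keep rows p*step .. (p+1)*step - 1 of every channel.
        stripe = [ch[p * step:(p + 1) * step] for ch in fmap]
        stripes.append(global_avg_pool(stripe))
    return stripes


# Toy feature map: 2 channels, 4 rows, 2 columns.
toy = [
    [[1.0, 3.0], [5.0, 7.0], [2.0, 2.0], [4.0, 0.0]],
    [[0.0, 0.0], [8.0, 0.0], [1.0, 1.0], [3.0, 3.0]],
]
global_feat = fused_global_feature(toy)  # one vector for the whole image
coarse = stripe_features(toy, 2)         # two stripes (coarser granularity)
fine = stripe_features(toy, 4)           # four stripes (finer granularity)
```

In a real network these operations would run on the backbone's output tensor; the nested lists here only keep the example dependency-free.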

Cite this article

Diao Zijian, Cao Shuai, Li Wenwei, Liang Jianan, Wen Guilin, Huang Weixi, Zhang Shouming. Person Re-Identification Based on Spatial Feature Learning and Multi-Granularity Feature Fusion[J]. Journal of Shanghai Jiaotong University (Science), 2025, 30(2): 363-374. DOI: 10.1007/s12204-023-2626-7

