To address the small object sizes and low detection accuracy typical of unmanned aerial vehicle (UAV) platforms, an object detection algorithm based on a deep aggregation network and a high-resolution fusion module is studied. A joint network for object detection and feature extraction is then studied to build a real-time multi-object tracking algorithm. To handle the object association failures caused by UAV movement, image registration is applied to multi-object tracking, and a camera motion discrimination model is proposed to improve the speed of the tracking algorithm. Simulation results show that the proposed algorithm improves the accuracy of multi-object tracking on the UAV platform and effectively solves the association failures caused by UAV movement.
LIU Zengmin (刘增敏), WANG Shentao (王申涛), YAO Lixiu (姚莉秀), CAI Yunze (蔡云泽). Online Multi-Object Tracking Under Moving Unmanned Aerial Vehicle Platform Based on Object Detection and Feature Extraction Network[J]. Journal of Shanghai Jiaotong University (Science), 2024, 29(3): 388-399.
DOI: 10.1007/s12204-022-2540-4
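
The abstract's idea of applying image registration to tracking, gated by a camera motion check, can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the registration transform is estimated or how the camera motion discrimination model makes its decision. The sketch assumes a 3x3 homography H, mapping the previous frame to the current one, has already been obtained by some registration step, and it uses a hypothetical corner-displacement threshold (tol_px) as a stand-in for the discrimination model. All function names are illustrative.

```python
from math import hypot


def apply_h(H, x, y):
    """Apply a 3x3 homography H (row-major nested lists) to the point (x, y)."""
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return u / w, v / w


def camera_moved(H, frame_size, tol_px=2.0):
    """Hypothetical gate: True if H shifts any frame corner by more than tol_px."""
    width, height = frame_size
    corners = [(0, 0), (width, 0), (width, height), (0, height)]
    shifts = []
    for x, y in corners:
        u, v = apply_h(H, x, y)
        shifts.append(hypot(u - x, v - y))
    return max(shifts) > tol_px


def warp_box(H, box):
    """Warp an (x1, y1, x2, y2) box and return its axis-aligned bounding box."""
    x1, y1, x2, y2 = box
    pts = [apply_h(H, x, y) for x, y in ((x1, y1), (x2, y1), (x2, y2), (x1, y2))]
    xs = [p[0] for p in pts]
    ys = [p[1] for p in pts]
    return (min(xs), min(ys), max(xs), max(ys))


def compensate_tracks(boxes, H, frame_size, tol_px=2.0):
    """Move previous-frame track boxes into the current frame.

    The warp is skipped when the camera is judged static, which is where
    a motion check can save time in a tracking loop.
    """
    if not camera_moved(H, frame_size, tol_px):
        return [tuple(b) for b in boxes]
    return [warp_box(H, b) for b in boxes]


# Example: the camera pans so that the scene shifts by (+10, -5) pixels.
H_pan = [[1, 0, 10], [0, 1, -5], [0, 0, 1]]
print(compensate_tracks([(100, 100, 120, 140)], H_pan, (1920, 1080)))
```

After compensation, the warped boxes can be compared with the current detections (for example by overlap or appearance distance) so that association is not thrown off by the apparent shift the UAV's own motion introduces.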