Convolutional neural networks (CNNs) have been applied to represent the target in state-of-the-art visual tracking. However, most existing algorithms treat visual tracking as an object-specific task, so the model must be retrained for each test video sequence. We propose a branch-activated multi-domain convolutional neural network (BAMDCNN). In contrast to most existing CNN-based trackers, which require frequent online training, BAMDCNN needs only offline training and online fine-tuning. Specifically, BAMDCNN exploits category-specific features that are more robust against appearance variations. To learn this category-specific information, we introduce a group algorithm and a branch activation method. Experimental results on challenging benchmarks show that the proposed algorithm outperforms other state-of-the-art methods. Moreover, compared with other CNN-based trackers, BAMDCNN increases tracking speed.
CHEN Yimin (陈一民), LU Rongrong (陆蓉蓉), ZOU Yibo (邹一波), ZHANG Yanhui (张燕辉). Branch-Activated Multi-Domain Convolutional Neural Network for Visual Tracking [J]. Journal of Shanghai Jiaotong University (Science), 2018, 23(3): 360.
DOI: 10.1007/s12204-018-1951-8
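The abstract does not give implementation details, but the branch activation method can be read as follows: shared layers trained offline feed several category-specific branches, and at test time the branch that responds most strongly to the target is activated and fine-tuned online. A minimal NumPy sketch of that selection step (the dimensions, the number of branches, and the mean-ReLU scoring rule are all illustrative assumptions, not taken from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

# Shared convolutional features for one frame, flattened to a vector.
# (512 dimensions is an illustrative choice, not from the paper.)
shared_feat = rng.standard_normal(512)

# One fully connected "branch" per object category, learned offline.
n_branches, hidden = 4, 64
branch_weights = [rng.standard_normal((hidden, 512)) for _ in range(n_branches)]

def branch_activation(feat, weights):
    """Mean ReLU response of a branch: a simple activation score."""
    return np.maximum(weights @ feat, 0.0).mean()

# Activate the branch that responds most strongly to the target;
# only this branch would then be fine-tuned online.
scores = [branch_activation(shared_feat, w) for w in branch_weights]
active_branch = int(np.argmax(scores))
print(f"activated branch: {active_branch}")
```

Because only one branch is updated online, the per-frame cost is close to that of a single-domain network, which is consistent with the claimed speed advantage over trackers that retrain the whole model.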