Object Tracking Algorithm Based on Temporal-Spatial Attention Mechanism

doi:10.11896/jsjkx.200800164

Abstract

Object tracking is widely used in intelligent monitoring, human-computer interaction, unmanned driving, and many other fields. Many efficient tracking methods have been proposed in recent years. However, trackers still face great challenges in complex scenarios such as occlusion, illumination variation, and background clutter, which can lead to tracking failure. To address these problems, this paper proposes an object tracking algorithm based on a temporal-spatial attention mechanism. First, a Siamese network architecture is used to improve the discriminative ability of the object features. Then, improved channel attention and spatial attention modules are introduced into the Siamese network; they assign different weights to features at different channels and spatial positions, so that the network focuses on the features that benefit tracking. In addition, an efficient online template updating mechanism is developed, which combines the features of the first frame with the features of subsequent high-confidence frames to reduce the risk of object drift. Finally, the proposed tracking algorithm is evaluated on the OTB2013 and OTB2015 benchmarks. Experimental results show that the performance of the proposed algorithm improves by 6.3% compared with current mainstream tracking algorithms.
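The pipeline summarized in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not specify the "improved" attention modules or the update rule, so the squeeze-and-excitation style channel weighting, the mean-plus-max spatial weighting, the blending coefficient `lam`, the confidence `threshold`, and all function names below are assumptions chosen only to show the general idea.

```python
import numpy as np


def sigmoid(x):
    """Logistic function; maps any real value into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))


def channel_attention(feat, w1, w2):
    """Reweight each channel of feat (C, H, W).

    Assumed form: global average pooling, a two-layer bottleneck
    (w1: (C//r, C), w2: (C, C//r)), and a sigmoid gate per channel.
    """
    squeezed = feat.mean(axis=(1, 2))          # (C,)
    hidden = np.maximum(w1 @ squeezed, 0.0)    # ReLU, (C//r,)
    weights = sigmoid(w2 @ hidden)             # (C,), each in (0, 1)
    return feat * weights[:, None, None]


def spatial_attention(feat):
    """Reweight each spatial position of feat (C, H, W).

    Assumed form: sigmoid of the channel-wise mean plus channel-wise max.
    """
    weights = sigmoid(feat.mean(axis=0) + feat.max(axis=0))  # (H, W)
    return feat * weights[None, :, :]


def cross_correlate(template, search):
    """Slide the template (C, h, w) over the search features (C, H, W).

    Returns an (H-h+1, W-w+1) response map; the peak marks the most
    similar location, as in Siamese matching.
    """
    _, th, tw = template.shape
    _, sh, sw = search.shape
    response = np.empty((sh - th + 1, sw - tw + 1))
    for i in range(response.shape[0]):
        for j in range(response.shape[1]):
            response[i, j] = np.sum(template * search[:, i:i + th, j:j + tw])
    return response


def update_template(first, feats, confidences, threshold=0.8, lam=0.7):
    """Blend the first-frame template with high-confidence later frames.

    Hypothetical rule: keep frames whose confidence >= threshold, average
    them, and mix with the first-frame features using weight lam.
    """
    kept = [f for f, c in zip(feats, confidences) if c >= threshold]
    if not kept:
        return first  # nothing reliable to add; keep the original template
    return lam * first + (1.0 - lam) * np.mean(kept, axis=0)
```

In this sketch the first-frame features always keep a fixed share of the template, which is one simple way to limit drift when later frames are occluded or mislocalized; low-confidence frames are discarded rather than down-weighted.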

Key words: Attention mechanism, Deep learning, Object tracking, Siamese network, Template update

CLC Number: TP301.6

CHENG Xu, CUI Yi-ping, SONG Chen, CHEN Bei-jing, ZHENG Yu-hui, SHI Jin-gang. Object Tracking Algorithm Based on Temporal-Spatial Attention Mechanism [J]. Computer Science, 2021, 48(4): 123-129.

