AFTM: Anchor-free Object Tracking Method with Attention Features

doi:10.11896/jsjkx.211000083

Abstract

As an important branch of computer vision, object tracking is widely used in fields such as intelligent video surveillance, human-computer interaction, and autonomous driving. Although object tracking has advanced considerably in recent years, tracking in complex environments remains a challenge: occlusion, object deformation, and illumination change can make tracking inaccurate and unstable. This paper proposes AFTM, an effective object tracking method with attention features. First, it constructs an adaptively generated group of attention weight factors, which implements an efficient adaptive fusion strategy for response maps and improves the accuracy of object positioning and bounding-box scale calculation during classification and regression. Second, to address class imbalance in the data set, the method uses a dynamically scaled cross-entropy loss as the loss function of the object positioning network, which corrects the optimization direction of the model and makes tracking more stable and reliable. Finally, it designs a corresponding learning-rate adjustment strategy to stochastically average the weights of several models, which enhances the generalization ability of the model. Experimental results on public data sets show that the proposed method achieves higher accuracy and more stable tracking in complex environments.
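The second and third contributions in the abstract can be illustrated with a small sketch. A "dynamically scaled cross-entropy loss" matches the usual description of focal loss, and "stochastically averaging the weights of several models" matches stochastic weight averaging; whether AFTM uses exactly these forms is an assumption here, and the function names, the `gamma` and `alpha` defaults, and the dictionary layout of the weights are all hypothetical choices made for illustration.

```python
import math


def cross_entropy(p, y):
    """Plain binary cross-entropy for one prediction.

    p: predicted foreground probability in (0, 1)
    y: ground-truth label, 1 (foreground) or 0 (background)
    """
    p_t = p if y == 1 else 1.0 - p  # probability assigned to the true class
    return -math.log(p_t)


def focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Cross-entropy scaled by (1 - p_t)^gamma (focal-loss form).

    gamma and alpha are illustrative defaults, not values from the paper.
    """
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha
    # The modulating factor shrinks the loss of well-classified samples,
    # so the many easy background locations dominate training less.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


def average_weights(snapshots):
    """Element-wise mean of weight snapshots.

    snapshots: list of dicts mapping a parameter name to a list of floats
    (a hypothetical stand-in for real model checkpoints).
    """
    n = len(snapshots)
    first = snapshots[0]
    return {
        name: [sum(s[name][i] for s in snapshots) / n for i in range(len(first[name]))]
        for name in first
    }


if __name__ == "__main__":
    # An easy sample (p_t = 0.9) is down-weighted far more than a hard one (p_t = 0.1).
    for p in (0.9, 0.1):
        print(p, cross_entropy(p, 1), focal_loss(p, 1))
    print(average_weights([{"w": [1.0, 3.0]}, {"w": [3.0, 5.0]}]))
```

With `gamma = 0` the modulating factor is 1 and the loss reduces to alpha-weighted cross-entropy, which makes the role of the dynamic scaling easy to check in isolation.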

Key words: Deep learning, Object tracking, Siamese network, Anchor-free, Attention mechanism

CLC Number: TP391.41

LI Xuehui, ZHANG Yongjun, SHI Dianxi, XU Huachi, SHI Yanyan. AFTM: Anchor-free Object Tracking Method with Attention Features[J]. Computer Science, 2023, 50(1): 138-146.

