Computer Science (计算机科学) ›› 2023, Vol. 50 ›› Issue (1): 138-146. doi: 10.11896/jsjkx.211000083
LI Xuehui (李雪辉)¹, ZHANG Yongjun (张拥军)¹, SHI Dianxi (史殿习)¹,²,³, XU Huachi (徐化池)¹, SHI Yanyan (史燕燕)²
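Abstract: Object tracking is an important branch of computer vision, with high research value in intelligent video surveillance, human-computer interaction, autonomous driving, and many other fields. Although object tracking has progressed well in recent years, occlusion, target deformation, illumination changes, and similar factors in complex tracking environments still reduce tracking accuracy and make tracking performance unstable. To address this, an anchor-free visual object tracking method that fuses attention features (Anchor-Free object Tracking Method, AFTM) is proposed. First, a group of adaptively generated attention weight factors is constructed in the classification and regression processes, yielding an efficient adaptive response-map fusion strategy that improves the accuracy of target localization and of bounding-box scale estimation. Second, to counter the class imbalance among samples in the dataset, a dynamically scalable cross-entropy loss is used as the loss function of the target localization network; this corrects the optimization direction of the model and makes tracking performance more stable and reliable. Finally, a corresponding learning-rate adjustment strategy is designed, and stochastic weight averaging is applied over a certain number of models to strengthen the model's generalization ability. Experimental results on public datasets show that AFTM achieves higher accuracy and more stable tracking in complex tracking environments.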
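The three components named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives no formulas, so the sketch assumes (a) softmax-normalized weights as a stand-in for the adaptively generated attention weight factors, (b) the widely used focal-loss form as one example of a dynamically scaled cross-entropy, with assumed defaults `gamma=2.0` and `alpha=0.25`, and (c) an equal-weight average of checkpoints for the weight-averaging step. The function names `fuse_response_maps`, `focal_loss`, and `average_weights` are hypothetical, and the learning-rate adjustment strategy is omitted because the abstract does not describe it.

```python
import math


def fuse_response_maps(maps, logits):
    """Fuse same-sized 2-D response maps with softmax-normalized weights.

    `logits` holds one raw score per map. In a real tracker these scores
    would be produced by a network; here they are plain inputs (assumption).
    """
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    weights = [e / total for e in exps]
    rows, cols = len(maps[0]), len(maps[0][0])
    return [
        [sum(w * m[r][c] for w, m in zip(weights, maps)) for c in range(cols)]
        for r in range(rows)
    ]


def focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Binary focal loss for one predicted probability p in (0, 1), label y in {0, 1}.

    The factor (1 - p_t) ** gamma shrinks the cross-entropy term for
    well-classified samples, so hard samples dominate the gradient.
    """
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


def average_weights(checkpoints):
    """Element-wise mean of checkpoints given as dicts: name -> list of floats."""
    n = len(checkpoints)
    first = checkpoints[0]
    return {
        name: [sum(ck[name][i] for ck in checkpoints) / n
               for i in range(len(first[name]))]
        for name in first
    }


if __name__ == "__main__":
    # Two 1x2 response maps with equal scores are fused with weights 0.5 each.
    print(fuse_response_maps([[[1.0, 0.0]], [[0.0, 1.0]]], [0.0, 0.0]))
    # An easy positive (p = 0.9) is penalized far less than a hard one (p = 0.6).
    print(focal_loss(0.9, 1), focal_loss(0.6, 1))
    # Averaging two checkpoints of a single hypothetical parameter "w".
    print(average_weights([{"w": [1.0, 3.0]}, {"w": [3.0, 5.0]}]))
```

With `gamma=0.0` the scaling factor becomes 1 and the loss reduces to an alpha-weighted cross-entropy, which is why this family can be described as a dynamically scalable cross-entropy loss.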