Temporal Consistency Preserving Multi-Task Sparse Deep Representation for Visual Tracking

Computer Science, 2021, Vol. 48, Issue 6: 110-117. doi: 10.11896/jsjkx.200800212

• Computer Graphics & Multimedia •

Corresponding author: GUO Wen (grewen@126.com)

Temporal Consistency Preserving Multi-Task Sparse Deep Representation for Visual Tracking

GUO Wen¹, YIN Tong-ling¹, ZHANG Tian-zhu², XU Chang-sheng³

1 School of Information and Electronic Engineering, Shandong Technology and Business University, Yantai, Shandong 264009, China
2 School of Information Science and Technology, University of Science and Technology of China, Hefei 230026, China
3 National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China

Received: 2020-08-30  Revised: 2020-10-21  Online: 2021-06-15  Published: 2021-06-03
About author: GUO Wen, born in 1978, Ph.D., associate professor, is a member of China Computer Federation. His main interests include computer vision and multimedia computing.
Supported by:
National Natural Science Foundation of China (62072286, 61876100, 61572296) and Shandong Provincial Natural Science Foundation of China (ZR2015FL020).

Abstract

Abstract: The key to visual tracking is a model that both captures the discriminability of the target's appearance representation and preserves the temporal consistency of the features throughout the tracking process. To improve the discriminability of the feature representation and to mitigate the degradation of features over time during tracking, this paper proposes a temporal consistency preserving multi-task sparse deep representation method for visual tracking. First, because features from different convolutional layers have different properties, they are used to build a multi-task sparse deep representation learning method that fully exploits the correlation among multi-source information. Second, a temporal consistency regularization term is constructed from the residuals of related frames; it compensates for feature degradation during tracking and improves the temporal consistency of the tracking features. Tracking results on a large number of benchmark videos show that, compared with current mainstream algorithms, the proposed algorithm tracks more accurately and more stably under complex backgrounds, fast motion, and other challenging conditions.

Key words: Deep convolutional feature, Multi-task learning, Temporal consistency, Visual tracking
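The abstract gives no formulas, so the following is only a minimal sketch of the general idea, not the authors' formulation. It assumes one "task" per convolutional layer, a shared row-sparse coefficient matrix enforced by an L2,1 penalty, and, as a stand-in for the temporal consistency term, a quadratic penalty that keeps the coefficients close to those of the previous frame. The function names, the parameters `lam` and `mu`, and the random demo data are all hypothetical. The problem is solved with plain proximal gradient descent.

```python
import numpy as np


def row_soft_threshold(C, tau):
    """Proximal operator of tau * sum_i ||C[i, :]||_2 (shrinks whole rows)."""
    norms = np.linalg.norm(C, axis=1, keepdims=True)
    return C * np.maximum(0.0, 1.0 - tau / np.maximum(norms, 1e-12))


def objective(C, dicts, feats, C_prev, lam, mu):
    """Assumed objective: data fit + temporal penalty + row sparsity."""
    fit = sum(0.5 * np.sum((D @ C[:, k] - y) ** 2)
              for k, (D, y) in enumerate(zip(dicts, feats)))
    temporal = 0.5 * mu * np.sum((C - C_prev) ** 2)
    sparsity = lam * np.sum(np.linalg.norm(C, axis=1))
    return fit + temporal + sparsity


def solve(dicts, feats, C_prev, lam=0.1, mu=0.5, n_iter=200):
    """Proximal gradient descent on the assumed objective.

    dicts[k]: (d_k, m) template dictionary for layer k (hypothetical).
    feats[k]: (d_k,) candidate feature from layer k (hypothetical).
    C_prev:   (m, K) coefficients from the previous frame.
    """
    # Lipschitz constant of the gradient of the smooth part.
    L = max(np.linalg.norm(D, 2) ** 2 for D in dicts) + mu
    step = 1.0 / L
    C = C_prev.copy()
    for _ in range(n_iter):
        grad = np.empty_like(C)
        for k, (D, y) in enumerate(zip(dicts, feats)):
            grad[:, k] = D.T @ (D @ C[:, k] - y) + mu * (C[:, k] - C_prev[:, k])
        C = row_soft_threshold(C - step * grad, step * lam)
    return C


# Synthetic demo: 8 templates, 3 "layers" with different feature sizes.
rng = np.random.default_rng(0)
m, dims = 8, (16, 32, 64)
dicts = [rng.standard_normal((d, m)) for d in dims]
c_true = np.zeros(m)
c_true[[1, 4]] = [1.0, -0.5]
feats = [D @ c_true + 0.01 * rng.standard_normal(D.shape[0]) for D in dicts]
C_prev = np.tile(c_true[:, None], (1, len(dims))) + 0.05 * rng.standard_normal((m, len(dims)))
C_hat = solve(dicts, feats, C_prev)
```

Each row of `C_hat` corresponds to one template, and the row-wise penalty pushes all layers to select the same few templates, which is one common way to couple tasks in multi-task sparse learning. Raising `mu` ties the solution more tightly to the previous frame's coefficients.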

CLC number: TP311

GUO Wen, YIN Tong-ling, ZHANG Tian-zhu, XU Chang-sheng. Temporal Consistency Preserving Multi-Mask Sparse Deep Representation for Visual Tracking[J]. Computer Science, 2021, 48(6): 110-117. https://doi.org/10.11896/jsjkx.200800212
