Computer Science, 2022, Vol. 49, Issue (12): 236-243. doi: 10.11896/jsjkx.220600037

• Computer Graphics & Multimedia •

  • Corresponding author: WANG Fu-tian (wft@ahu.edu.cn)
  • Author e-mail: namyeung@foxmail.com

RGBT Object Tracking Based on High Rank Feature and Position Attention

YANG Lan-lan, WANG Wen-qi, WANG Fu-tian   

  1. School of Computer Science and Technology,Anhui University,Hefei 230000,China
  • Received:2022-06-06 Revised:2022-07-25 Published:2022-12-14
  • About author:YANG Lan-lan,born in 1994,postgraduate.Her main research interests include RGBT object tracking and so on.WANG Fu-tian,born in 1981,Ph.D,professor.His main research interests include image processing,computer vision and edge computing.


Abstract: RGBT object tracking exploits the complementary advantages of the visible (RGB) and thermal infrared (T) modalities to overcome the modality limitations common in single-modality tracking, thereby improving tracking performance in complex environments. In RGBT tracking, precisely locating the object and effectively fusing the two modalities are both crucial problems. To address them, this paper proposes a new RGBT tracking method that explores high-rank feature maps and introduces position attention. The method first applies position attention to the deep and shallow features of the backbone network to capture the object's location information, and then explores the high-rank feature maps of the two modalities before fusion, using feature importance to guide modality fusion. To capture location information, average pooling is applied separately along rows and columns. In the high-rank feature guidance module, the fusion of feature maps is guided by their matrix rank, and feature maps with small rank are deleted directly to remove redundancy and noise and to obtain a more robust feature representation. Experimental results on two RGBT tracking benchmark datasets show that, compared with other RGBT tracking methods, the proposed method achieves better precision and success rate.
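The row-and-column average pooling described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the published position-attention module is built on learned convolutions (in the spirit of coordinate attention [25]), which are omitted here, and the function name `position_attention` is our own assumption.

```python
import numpy as np

def position_attention(x):
    """Position-attention sketch: pool along rows and columns separately.

    x: feature map of shape (C, H, W).
    The directional pooling keeps positional information along each axis,
    which a single global pooling would discard.
    """
    # Average pooling over the width -> one descriptor per row: (C, H)
    pool_h = x.mean(axis=2)
    # Average pooling over the height -> one descriptor per column: (C, W)
    pool_w = x.mean(axis=1)
    # Sigmoid gates (the real module inserts learned 1x1 convolutions here)
    att_h = 1.0 / (1.0 + np.exp(-pool_h))   # (C, H)
    att_w = 1.0 / (1.0 + np.exp(-pool_w))   # (C, W)
    # Reweight the input with the outer product of the two directional gates
    return x * att_h[:, :, None] * att_w[:, None, :]
```

The outer product of the two gates reweights every spatial position, so responses at the object's rows and columns are emphasized jointly.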

Key words: RGBT object tracking, High-rank feature map, Object location information
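The rank-guided pruning idea (dropping feature maps whose matrix rank is small, following HRank [24]) can likewise be sketched in NumPy. The function name and the `keep_ratio` parameter are illustrative assumptions, not the paper's actual interface.

```python
import numpy as np

def select_high_rank_maps(feat, keep_ratio=0.5):
    """Rank-guided channel selection (sketch of the high-rank guidance idea).

    feat: feature maps of shape (C, H, W).
    Channels whose 2-D maps have low matrix rank carry less information;
    they are dropped, and only the high-rank maps are kept for fusion.
    Returns the kept maps and their original channel indices.
    """
    C = feat.shape[0]
    # Matrix rank of each channel's H x W feature map
    ranks = np.array([np.linalg.matrix_rank(feat[c]) for c in range(C)])
    k = max(1, int(C * keep_ratio))
    # Indices of the k highest-rank maps, restored to channel order
    keep = np.sort(np.argsort(ranks)[::-1][:k])
    return feat[keep], keep
```

In the paper this selection guides how the RGB and thermal feature maps are fused, removing redundant and noisy channels before fusion.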

CLC number: TP18
[1]JI H X Y,LIANG P P,CHAI Y M,et al.Planar Object Tracking Algorithm Based on Key Points and Optical Flow[J].Computer Engineering,2021,47(4):234-240.
[2]DANELLJAN M,BHAT G,KHAN F S,et al.Atom:Accurate tracking by overlap maximization[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition.2019:4660-4669.
[3]WANG Q,ZHANG L,BERTINETTO L,et al.Fast online object tracking and segmentation:A unifying approach[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition.2019:1328-1338.
[4]WU Y,LIM J,YANG M H.Online object tracking:A benchmark[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.2013:2411-2418.
[5]VALMADRE J,BERTINETTO L,HENRIQUES J F,et al.Long-term tracking in the wild:A benchmark[C]//Proceedings of the European Conference on Computer Vision(ECCV).2018:670-685.
[6]DANELLJAN M,HAGER G,KHAN F S,et al.Learning Spatially Regularized Correlation Filters for Visual Tracking[C]//Proceedings of the IEEE International Conference on Computer Vision.2015.
[7]DANELLJAN M,BHAT G,SHAHBAZ KHAN F,et al.Eco:Efficient convolution operators for tracking[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.2017:6638-6646.
[8]LI F,TIAN C,ZUO W,et al.Learning spatial-temporal regularized correlation filters for visual tracking[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.2018:4904-4913.
[9]LI G,YU Y.Deep contrast learning for salient object detection[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.2016:478-487.
[10]WU A,ZHENG W S,YU H X,et al.RGB-infrared cross-modality person re-identification[C]//Proceedings of the IEEE International Conference on Computer Vision.2017:5380-5389.
[11]BADRINARAYANAN V,KENDALL A,CIPOLLA R.SegNet:A deep convolutional encoder-decoder architecture for image segmentation[J].arXiv:1511.00561,2015.
[12]LI C,XIA W,YAN Y,et al.Segmenting objects in day and night:Edge-conditioned cnn for thermal image semantic segmentation[J].IEEE Transactions on Neural Networks and Learning Systems,2020,32(7):3069-3082.
[13]XU D,OUYANG W,RICCI E,et al.Learning cross-modal deep representations for robust pedestrian detection[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.2017:5363-5371.
[14]GADE R,MOESLUND T B.Thermal cameras and applications:a survey[J].Machine Vision and Applications,2014,25(1):245-262.
[15]LI C,ZHAO N,LU Y,et al.Weighted sparse representation regularized graph learning for RGB-T object tracking[C]//Proceedings of the 25th ACM International Conference on Multimedia.2017:1856-1864.
[16]LAN X,YE M,ZHANG S,et al.Robust collaborative discriminative learning for RGB-infrared tracking[C]//Proceedings of the AAAI Conference on Artificial Intelligence.2018:7008-7015.
[17]WANG Y,LI C,TANG J.Learning soft-consistent correlation filters for RGB-T object tracking[C]//Chinese Conference on Pattern Recognition and Computer Vision(PRCV).Cham:Springer,2018:295-306.
[18]LI C,ZHU C,HUANG Y,et al.Cross-modal ranking with soft consistency and noisy labels for robust RGB-T tracking[C]//Proceedings of the European Conference on Computer Vision(ECCV).2018:808-823.
[19]ZHANG L,DANELLJAN M,GONZALEZ-GARCIA A,et al.Multi-modal fusion for end-to-end RGB-T tracking[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision Workshops.2019:2252-2261.
[20]ZHANG H,ZHANG L,ZHUO L,et al.Object tracking in RGB-T videos using modal-aware attention network and competitive learning[J].Sensors,2020,20(2):393-411.
[21]WANG C,XU C,CUI Z,et al.Cross-modal pattern-propagation for RGB-T tracking[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition.2020:7064-7073.
[22]ZHANG P,ZHAO J,BO C,et al.Jointly modeling motion and appearance cues for robust RGB-T tracking[J].IEEE Transactions on Image Processing,2021,30:3335-3347.
[23]LI C,LIU L,LU A,et al.Challenge-aware RGBT tracking[C]//European Conference on Computer Vision.Cham:Springer,2020:222-237.
[24]LIN M,JI R,WANG Y,et al.HRank:Filter pruning using high-rank feature map[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition.2020:1529-1538.
[25]HOU Q,ZHOU D,FENG J.Coordinate attention for efficient mobile network design[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition.2021:13713-13722.
[26]WOO S,PARK J,LEE J Y,et al.Cbam:Convolutional block attention module[C]//Proceedings of the European Conference on Computer Vision(ECCV).2018:3-19.
[27]HU J,SHEN L,ALBANIE S,et al.Squeeze-and-Excitation Networks[J].IEEE Transactions on Pattern Analysis and Machine Intelligence,2019,42(8):2011-2023.
[28]ZHU Y,LI C,TANG J,et al.Quality-aware feature aggregation network for robust RGBT tracking[J].IEEE Transactions on Intelligent Vehicles,2020,6(1):121-130.
[29]TANG Z,XU T,LI H,et al.Exploring Fusion Strategies for Accurate RGBT Visual Object Tracking[J].arXiv:2201.08673,2022.
[30]CHATFIELD K,SIMONYAN K,VEDALDI A,et al.Return of the devil in the details:Delving deep into convolutional nets[J].arXiv:1405.3531,2014.
[31]SIMONYAN K,ZISSERMAN A.Very deep convolutional networks for large-scale image recognition[J].arXiv:1409.1556,2014.
[32]BOTTOU L.Large-scale machine learning with stochastic gradient descent[C]//Proceedings of COMPSTAT.2010:177-186.
[33]NAM H,HAN B.Learning multi-domain convolutional neural networks for visual tracking[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.2016:4293-4302.
[34]LI C,CHENG H,HU S,et al.Learning collaborative sparse representation for grayscale-thermal tracking[J].IEEE Transactions on Image Processing,2016,25(12):5743-5756.
[35]LI C,LIANG X,LU Y,et al.RGB-T object tracking:Benchmark and baseline[J].Pattern Recognition,2019,96:106977.
[36]PU S,SONG Y,MA C,et al.Deep attentive tracking via reciprocative learning[J].Advances in Neural Information Processing Systems,2018,31:1935-1945.
[37]GAO Y,LI C,ZHU Y,et al.Deep adaptive fusion network for high performance RGBT tracking[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision Workshops.2019:91-99.
[38]ZHU Y,LI C,LUO B,et al.Dense feature aggregation and pruning for RGBT tracking[C]//Proceedings of the 27th ACM International Conference on Multimedia.2019:465-472.
[39]JUNG I,SON J,BAEK M,et al.Real-time mdnet[C]//Proceedings of the European Conference on Computer Vision(ECCV).2018:83-98.
[40]ZHANG Z,PENG H.Deeper and wider siamese networks for real-time visual tracking[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition.2019:4591-4600.
[41]DANELLJAN M,BHAT G,SHAHBAZ KHAN F,et al.Eco:Efficient convolution operators for tracking[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.2017:6638-6646.
[42]LI C,ZHAO N,LU Y,et al.Weighted sparse representation regularized graph learning for RGB-T object tracking[C]//Proceedings of the 25th ACM International Conference on Multimedia.2017:1856-1864.
[43]LI C L,LU A,ZHENG A H,et al.Multi-adapter RGBT tracking[C]//2019 IEEE/CVF International Conference on Computer Vision Workshop(ICCVW).IEEE,2019:2262-2270.
[44]HARE S,GOLODETZ S,SAFFARI A,et al.Struck:Structured output tracking with kernels[J].IEEE Transactions on Pattern Analysis and Machine Intelligence,2015,38(10):2096-2109.
[45]VALMADRE J,BERTINETTO L,HENRIQUES J,et al.End-to-end representation learning for correlation filter based tra-cking[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.2017:2805-2813.
[46]HENRIQUES J F,CASEIRO R,MARTINS P,et al.High-speed tracking with kernelized correlation filters[J].IEEE Transactions on Pattern Analysis and Machine Intelligence,2014,37(3):583-596.
[47]WU Y,BLASCH E,CHEN G,et al.Multiple source data fusion via sparse representation for robust visual tracking[C]//14th International Conference on Information Fusion.IEEE,2011:1-8.
[48]LIU H P,SUN F C.Fusion tracking in color and infrared images using joint sparse representation[J].Science China Information Sciences,2012,55(3):590-599.