Computer Science, 2023, Vol. 50, Issue (1): 131-137. doi: 10.11896/jsjkx.211100097

• Computer Graphics & Multimedia •

Multi-object Tracking Based on Cross-correlation Attention and Chained Frames

CHEN Yunfang, LU Yangyang, ZHOU Xin, ZHANG Wei

  1. School of Computer Science, Nanjing University of Posts and Telecommunications, Nanjing 210023, China
  • Received: 2021-11-08 Revised: 2022-06-30 Online: 2023-01-15 Published: 2023-01-09
  • Corresponding author: ZHANG Wei (zhangw@njupt.edu.cn)
  • About author: CHEN Yunfang (chenyf@njupt.edu.cn), born in 1976, Ph.D., master's supervisor. His main research interests include artificial intelligence algorithms, functional analysis of specific application areas, and application development using intelligent systems.
    ZHANG Wei, born in 1973, Ph.D., Ph.D. supervisor. His main research interests include intelligent perception and cognition on UAV platforms, privacy protection, and artificial intelligence security.
  • Supported by:
    National Key R&D Program of China (2019YFB2101700).

Abstract: One-stage methods have gradually become the mainstream in multi-object tracking (MOT) because of their advantage in inference speed. However, their tracking accuracy is lower than that of two-stage methods. One reason is that they take a single frame as input, so the association between targets is weak and targets are easily lost; the other is that they ignore the differences between the detection and tracking tasks. To alleviate these limitations, a multi-object tracking algorithm based on cross-correlation attention and chained frames (MOT-CCC) is proposed. MOT-CCC takes two consecutive frames as input and converts the target association problem into a regression problem over pairs of detection boxes in the two frames, which strengthens the association between targets. A cross-correlation attention module decouples the detection task from the identification task to balance and reduce the competition between them. In addition, the proposed algorithm integrates the three modules of object detection, feature extraction and data association into a single network to achieve end-to-end optimization, which improves tracking accuracy and reduces tracking time. On the MOT16 and MOT17 benchmarks, compared with the baseline CTracker algorithm, MOT-CCC increases MOTA by 1.3% and decreases FP by 13%.
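The abstract does not spell out how box pairs regressed from adjacent frame pairs are linked into tracks. As a minimal sketch of the chained-frame idea, assume that each frame pair (t, t+1) yields a list of box pairs, and that adjacent pairs are linked by IoU on their shared frame. The names `iou`, `chain_pairs` and `iou_threshold` are hypothetical, and the greedy assignment below is a simplification chosen for brevity, not the paper's confirmed procedure.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)
BoxPair = Tuple[Box, Box]  # the same target in frame t and in frame t+1


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def chain_pairs(nodes: List[List[BoxPair]],
                iou_threshold: float = 0.5) -> List[List[int]]:
    """Assign a track ID to every box pair of every chain node.

    nodes[t] holds the box pairs regressed from frames (t, t+1).
    Adjacent nodes share frame t+1, so the second box of a pair in
    nodes[t] is compared with the first box of a pair in nodes[t+1].
    Assumption: greedy highest-IoU-first matching, used here for brevity.
    """
    ids: List[List[int]] = []
    next_id = 0
    for t, pairs in enumerate(nodes):
        current = [-1] * len(pairs)
        if t > 0:
            prev_pairs, prev_ids = nodes[t - 1], ids[t - 1]
            candidates = [
                (iou(p[1], q[0]), i, j)
                for i, p in enumerate(prev_pairs)
                for j, q in enumerate(pairs)
            ]
            used_prev, used_cur = set(), set()
            for score, i, j in sorted(candidates, reverse=True):
                if score < iou_threshold:
                    break  # remaining candidates overlap too little
                if i in used_prev or j in used_cur:
                    continue
                current[j] = prev_ids[i]  # continue the existing track
                used_prev.add(i)
                used_cur.add(j)
        for j in range(len(pairs)):
            if current[j] == -1:  # unmatched pair starts a new track
                current[j] = next_id
                next_id += 1
        ids.append(current)
    return ids
```

Calling `chain_pairs(nodes)` returns one list of track IDs per frame pair, aligned with the input box pairs. A pair that finds no sufficiently overlapping predecessor on the shared frame receives a fresh ID.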

Key words: Multi-object tracking, Chained tracker, Cross-correlation attention, One-shot, End-to-End

CLC Number: 

  • TP391