Computer Science ›› 2024, Vol. 51 ›› Issue (11A): 231200167-9. doi: 10.11896/jsjkx.231200167
张莘沂, 谭光
ZHANG Xinyi, TAN Guang
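Abstract: Real-time video analytics typically runs compute-intensive deep neural network models to track objects. In practice, it is increasingly important to offload the analysis of multiple video streams to edge devices near the cameras. These edge devices, however, usually have very limited computing resources, which results in poor tracking accuracy. The loss stems mainly from three causes: stale detection results, accumulated tracking errors, and the inability to perceive newly appearing objects. To address these problems, this paper proposes a detection-and-tracking framework based on prediction and correction. The framework has three core components: 1) predictive detection propagation, which uses a lightweight prediction model to quickly update stale object bounding boxes so that they match the current frame; 2) a frame-difference corrector, which uses frame-difference information to regress drifted bounding boxes back to the correct position; and 3) a new-object detector, which clusters frame-difference features during tracking to discover newly appearing objects. Experimental results show that, compared with baseline methods, the framework improves accuracy by 19.4% to 34.7% across different traffic scenarios while maintaining real-time running speed.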
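To make the first and third components more concrete, the following is a minimal, illustrative sketch and not the authors' implementation. It assumes a constant-velocity shift as a stand-in for the lightweight prediction model, and 8-connected grouping of thresholded frame-difference pixels as a stand-in for the clustering step; the abstract does not specify either. All function names, the `threshold` value, and the `(x, y, w, h)` box layout are hypothetical. The frame-difference corrector, which regresses boxes, is not sketched here.

```python
from collections import deque


def propagate_box(box, velocity, frames_elapsed):
    """Shift a stale (x, y, w, h) box by an assumed constant per-frame velocity."""
    x, y, w, h = box
    vx, vy = velocity
    return (x + vx * frames_elapsed, y + vy * frames_elapsed, w, h)


def frame_difference(prev, curr, threshold):
    """Return the set of (row, col) pixels whose absolute change exceeds threshold."""
    return {
        (r, c)
        for r, row in enumerate(curr)
        for c, value in enumerate(row)
        if abs(value - prev[r][c]) > threshold
    }


def cluster_pixels(pixels):
    """Group changed pixels into 8-connected clusters and return their
    bounding boxes as sorted (x, y, w, h) tuples."""
    remaining = set(pixels)
    boxes = []
    while remaining:
        seed = remaining.pop()
        queue = deque([seed])
        members = [seed]
        while queue:
            r, c = queue.popleft()
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    neighbor = (r + dr, c + dc)
                    if neighbor in remaining:
                        remaining.remove(neighbor)
                        queue.append(neighbor)
                        members.append(neighbor)
        rows = [m[0] for m in members]
        cols = [m[1] for m in members]
        boxes.append(
            (min(cols), min(rows),
             max(cols) - min(cols) + 1, max(rows) - min(rows) + 1)
        )
    return sorted(boxes)


def overlaps(a, b):
    """True if two (x, y, w, h) boxes share any area."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    return ax < bx + bw and bx < ax + aw and ay < by + bh and by < ay + ah


def find_new_objects(prev, curr, tracked_boxes, threshold=30):
    """Return motion clusters that do not overlap any already-tracked box."""
    candidates = cluster_pixels(frame_difference(prev, curr, threshold))
    return [c for c in candidates
            if not any(overlaps(c, t) for t in tracked_boxes)]
```

In this sketch, a motion cluster counts as a new object only if it overlaps no propagated box, so regions already explained by tracked objects are ignored. A real system would need a more robust association rule and noise filtering.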