Computer Science ›› 2024, Vol. 51 ›› Issue (11A): 231200167-9. doi: 10.11896/jsjkx.231200167

• Image Processing & Multimedia Technology •

Real-time Accurate Object Tracking for Resource-constrained Edge Devices

ZHANG Xinyi, TAN Guang   

  1. School of Intelligent Systems Engineering, Sun Yat-Sen University, Shenzhen, Guangdong 510275, China
  • Online: 2024-11-16 Published: 2024-11-13
  • About author: ZHANG Xinyi, born in 1998, is a postgraduate student. Her main research interests include object detection and tracking, and their applications in mobile computing.
    TAN Guang, born in 1978, Ph.D., professor, is a member of the CCF (No. 19464M). His main research interests include the design and evaluation of networked systems and machine learning.
  • Supported by:
    National Natural Science Foundation of China (62372488).

Abstract: Real-time video analysis tasks often involve running computationally intensive deep neural network (DNN) models for object tracking. In practical applications, offloading multi-stream video analysis tasks to edge devices near the cameras has become crucial. However, these edge devices often have limited computing resources, which results in low tracking accuracy. The accuracy loss is primarily due to outdated detection results, accumulated tracking errors, and the inability to detect new objects. To address these issues, a prediction-correction-based framework is proposed. The framework comprises three core components: 1) Predictive detection propagation, which uses a lightweight prediction model to rapidly update outdated object bounding boxes so that they match the current frame. 2) Frame difference corrector, which refines bounding boxes based on frame difference information. 3) New object detector, which discovers newly appearing objects during tracking by clustering frame difference features. Experimental results demonstrate that the framework improves accuracy by 19.4% to 34.7% over baseline methods across various traffic scenarios while maintaining real-time execution speed.
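The three components named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the prediction model, the correction rule, or the clustering algorithm, so the sketch assumes a constant-velocity shift for propagation, a snap-to-best-overlap rule for correction, and 4-connected pixel grouping as a stand-in for clustering. All function names, thresholds, and the synthetic frames are hypothetical.

```python
from collections import deque

import numpy as np


def propagate_box(box, velocity, frames_elapsed):
    """Shift a stale box (x1, y1, x2, y2) by an assumed constant per-frame velocity."""
    x1, y1, x2, y2 = box
    dx = velocity[0] * frames_elapsed
    dy = velocity[1] * frames_elapsed
    return (x1 + dx, y1 + dy, x2 + dx, y2 + dy)


def motion_mask(prev_gray, curr_gray, threshold=25):
    """Frame difference: mark pixels whose intensity changed by more than threshold."""
    diff = np.abs(curr_gray.astype(np.int16) - prev_gray.astype(np.int16))
    return diff > threshold


def cluster_boxes(mask, min_pixels=20):
    """Group changed pixels into 4-connected regions and return their bounding boxes."""
    h, w = mask.shape
    seen = np.zeros((h, w), dtype=bool)
    boxes = []
    for sy, sx in zip(*np.nonzero(mask)):
        if seen[sy, sx]:
            continue
        seen[sy, sx] = True
        queue = deque([(int(sy), int(sx))])
        ys, xs = [], []
        while queue:  # breadth-first flood fill over one region
            y, x = queue.popleft()
            ys.append(y)
            xs.append(x)
            for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                if 0 <= ny < h and 0 <= nx < w and mask[ny, nx] and not seen[ny, nx]:
                    seen[ny, nx] = True
                    queue.append((ny, nx))
        if len(ys) >= min_pixels:  # drop tiny regions as noise
            boxes.append((min(xs), min(ys), max(xs) + 1, max(ys) + 1))
    return boxes


def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    iw = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def correct_box(predicted, motion_boxes, min_iou=0.3):
    """Snap a predicted box to the best-overlapping motion box, if one overlaps enough."""
    best = max(motion_boxes, key=lambda m: iou(predicted, m), default=None)
    if best is not None and iou(predicted, best) >= min_iou:
        return best
    return predicted


def new_object_candidates(motion_boxes, tracked_boxes, max_iou=0.1):
    """Motion boxes that overlap no tracked box are candidate new objects."""
    return [m for m in motion_boxes
            if all(iou(m, t) <= max_iou for t in tracked_boxes)]


# Synthetic example: one tracked object has moved, one new object has appeared.
prev_frame = np.zeros((60, 80), dtype=np.uint8)
curr_frame = prev_frame.copy()
curr_frame[10:20, 30:45] = 200   # tracked object at its current position
curr_frame[40:50, 5:15] = 200    # newly appeared object

stale_box = (24, 10, 39, 20)     # last detection, three frames old
predicted = propagate_box(stale_box, velocity=(1, 0), frames_elapsed=3)
regions = cluster_boxes(motion_mask(prev_frame, curr_frame))
corrected = correct_box(predicted, regions)
print("corrected box:", corrected)
print("new objects:", new_object_candidates(regions, [corrected]))
```

In this sketch the expensive detector would run only occasionally, while the cheap propagation, correction, and frame-difference steps run on every frame. A density-based clustering method could replace the connected-component grouping without changing the surrounding logic.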

Key words: Edge device, Resource efficiency, Object detection, Object tracking

CLC Number: 

  • TP391.4