Computer Science, 2026, Vol. 53, Issue (1): 153-162. doi: 10.11896/jsjkx.250300021

• Computer Graphics & Multimedia •

EvR-DETR: Event-RGB Fusion for Lightweight End-to-End Object Detection

ZHOU Bingquan, JIANG Jie, CHEN Jiangmin, ZHAN Lixin   

  1. College of System Engineering, National University of Defense Technology, Changsha 410073, China
  • Received: 2025-03-04 Revised: 2025-05-08 Published: 2026-01-08
  • About author: ZHOU Bingquan, born in 2000, postgraduate, professor. His main research interests include computer vision and event-based vision.
    JIANG Jie, born in 1974, Ph.D., professor. His main research interests include artificial intelligence and deep learning, visualization and visual analytics, virtual reality and intelligent interaction.

Abstract: Event cameras based on neuromorphic spike signals can provide information about illumination changes, compensating for the performance degradation of traditional RGB cameras in object detection under adverse environments. However, existing methods that fuse event cameras with conventional cameras suffer from large parameter counts and non-end-to-end training, which restrict the effectiveness of modality fusion. To address this, this paper proposes a lightweight end-to-end object detection framework that integrates event and RGB information through multi-granularity fusion of multi-scale features across different network levels. By implementing lightweight fusion modules with reparameterized convolutions and enabling end-to-end training, the proposed framework enhances the model’s capability to extract complementary information from both modalities and to cope with challenging conditions in autonomous driving. Evaluated on the large-scale PKU-SOD dataset, which contains vehicular visual data under low-light, high-speed motion blur, and normal illumination scenarios, the proposed method significantly reduces model parameters compared with state-of-the-art multimodal approaches while improving detection accuracy and inference speed.
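
The abstract mentions reparameterized convolutions but does not describe the module design. As a rough illustration of the general idea only (not the paper's fusion module), the sketch below shows how a parallel 3×3 branch and 1×1 branch can be folded into a single 3×3 kernel for inference. It is deliberately simplified and assumes a single channel, stride 1, zero padding, no bias and no batch normalization; the helper names `conv2d_same` and `merge_3x3_and_1x1` are hypothetical.

```python
import random


def conv2d_same(x, k):
    """Stride-1, zero-padded cross-correlation of a 2-D list x
    with an odd-sized square kernel k (output has the same size as x)."""
    h, w, n = len(x), len(x[0]), len(k)
    p = n // 2
    out = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            acc = 0.0
            for i in range(n):
                for j in range(n):
                    rr, cc = r + i - p, c + j - p
                    if 0 <= rr < h and 0 <= cc < w:  # zero padding outside
                        acc += k[i][j] * x[rr][cc]
            out[r][c] = acc
    return out


def merge_3x3_and_1x1(k3, k1):
    """Fold a parallel 1x1 branch into the 3x3 kernel by adding
    its single weight to the centre tap (hypothetical helper)."""
    merged = [row[:] for row in k3]
    merged[1][1] += k1[0][0]
    return merged


random.seed(0)
x = [[random.uniform(-1, 1) for _ in range(6)] for _ in range(6)]
k3 = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(3)]
k1 = [[random.uniform(-1, 1)]]

# Training-time form: two branches whose outputs are summed.
y3, y1 = conv2d_same(x, k3), conv2d_same(x, k1)
two_branch = [[a + b for a, b in zip(r3, r1)] for r3, r1 in zip(y3, y1)]

# Inference-time form: one merged 3x3 convolution.
merged = merge_3x3_and_1x1(k3, k1)
single = conv2d_same(x, merged)

max_abs_diff = max(
    abs(a - b) for ra, rb in zip(two_branch, single) for a, b in zip(ra, rb)
)
print(max_abs_diff)  # should be at floating-point rounding level
```

Because convolution is linear, the two forms produce the same output, so the extra branch adds no parameters or compute at inference time. This is the usual motivation for using reparameterization in lightweight models.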

Key words: Object detection, Neuromorphic camera, Autonomous driving, Deep learning, End-to-end object detection, Event-based object detection, Light-weight object detection

CLC Number: 

  • TP391