Computer Science ›› 2022, Vol. 49 ›› Issue (5): 43-49. doi: 10.11896/jsjkx.210400047

• Computer Graphics & Multimedia

  • Corresponding author: SHI Dian-xi (dxshi@nudt.edu.cn)
  • First author's e-mail: hcxu97@outlook.com

Time Information Integration Network for Event Cameras

XU Hua-chi1, SHI Dian-xi1,2,3, CUI Yu-ning2, JING Luo-xi2, LIU Cong2   

  1. 1 National Innovation Institute of Defense Technology, Beijing 100071, China
     2 College of Computer, National University of Defense Technology, Changsha 410073, China
     3 Tianjin Artificial Intelligence Innovation Center, Tianjin 300457, China
  • Received: 2021-04-06  Revised: 2021-08-04  Online: 2022-05-15  Published: 2022-05-06
  • About author: XU Hua-chi, born in 1997, postgraduate. His main research interests include event cameras, deep learning and object detection.
    SHI Dian-xi, born in 1966, Ph.D, researcher, Ph.D supervisor, is a member of China Computer Federation. His main research interests include artificial intelligence, distributed computing, cloud computing and big data processing, etc.
  • Supported by:
    National Key Research and Development Program of China (2017YFB1001901) and Tianjin Intelligent Manufacturing Special Fund Project (20181108).


Abstract: Event cameras are asynchronous sensors that operate in a completely different way from traditional cameras. Rather than capturing pictures at a fixed rate, event cameras measure light changes (called events) independently for every pixel. As a consequence, they alleviate the problems traditional cameras face under complex lighting conditions and in scenes where objects move at high speed. With the development of convolutional neural networks, learning-based pattern recognition methods have made great progress in visual tasks such as optical flow estimation and object recognition by converting the output of an event camera into a pseudo-image representation. However, such methods discard the temporal correlation between event streams, so the texture of the pseudo-image is not clear enough and features are difficult to extract. The key to solving this problem lies in how to model the relevant information between events in a sample. Therefore, a neural network framework based on an event stream partition algorithm is proposed, which explicitly integrates the temporal information of event streams. The framework divides the incoming stream of events into several parts, and a weight allocation network assigns a different weight to each part of the stream. Then the framework uses a convolutional neural network to fuse temporal information and extract high-level features. Finally, the input sample is classified. We thoroughly validate the proposed framework on object recognition. Comparison experiments on the N-Caltech101 and N-Cars datasets show that the proposed framework achieves a significant improvement in classification accuracy compared with existing state-of-the-art algorithms.

Key words: Convolutional neural network, Event streams, Fusion, Temporal information, Weight allocation
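The partition-and-weight pipeline described in the abstract can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation: the equal-duration slicing, the two-channel per-polarity count encoding, and the fixed logits standing in for the learned weight-allocation network are all assumptions made for the example.

```python
import numpy as np

def partition_events(events, num_slices, height, width):
    """Split an event stream into equal-duration temporal slices and
    accumulate each slice into a 2-channel (per-polarity) count image.

    events: array of shape (N, 4) with columns (x, y, t, polarity),
    polarity in {0, 1}.  Returns (num_slices, 2, height, width)."""
    x = events[:, 0].astype(int)
    y = events[:, 1].astype(int)
    t = events[:, 2]
    p = events[:, 3].astype(int)
    t0, t1 = t.min(), t.max()
    # Assign each event to one of num_slices equal-duration time bins.
    bins = ((t - t0) / max(t1 - t0, 1e-9) * num_slices).astype(int)
    bins = np.minimum(bins, num_slices - 1)  # last event falls in last bin
    frames = np.zeros((num_slices, 2, height, width), dtype=np.float32)
    np.add.at(frames, (bins, p, y, x), 1.0)  # unbuffered accumulation
    return frames

def fuse_slices(frames, logits):
    """Weight each temporal slice with softmax(logits) and sum them into
    a single fused pseudo-image of shape (2, height, width).

    In the paper the logits come from a learned weight-allocation network;
    here they are passed in directly for illustration."""
    w = np.exp(logits - logits.max())
    w /= w.sum()
    return np.tensordot(w, frames, axes=1)

# Toy usage: four events on a 4x4 sensor, split into two slices.
events = np.array([[0, 0, 0.0, 0],
                   [1, 1, 0.2, 1],
                   [2, 2, 0.8, 0],
                   [3, 3, 1.0, 1]])
frames = partition_events(events, num_slices=2, height=4, width=4)
fused = fuse_slices(frames, logits=np.zeros(2))  # uniform weights
```

With uniform weights the fused image is just the average of the slice images; a learned weight-allocation network would instead emphasize the slices that carry the most discriminative motion information.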

CLC number: TP183