Computer Science, 2024, Vol. 51, Issue (11A): 231000013-6. doi: 10.11896/jsjkx.231000013
胡刚, 梁栋, 黄圣君
HU Gang, LIANG Dong, HUANG Shengjun
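Abstract: Event cameras offer high temporal resolution, high dynamic range, and low power consumption, and are commonly used for object detection in scenarios where conventional cameras are limited (high speed, strong light, low light, etc.). However, because event-camera pixels respond asynchronously, the event sequences they output are difficult to annotate manually, so existing methods obtain event-sequence labels by transferring labels from RGB images. The transferred labels contain a large number of noisy labels, and some objects in the event sequences have blurred texture, which makes it difficult to achieve satisfactory model performance. To address this problem, a cross-modal noise-filtering object detection algorithm for event cameras is proposed. The algorithm uses a pre-trained event-camera detector to screen open-source RGB object detection datasets and select the RGB images that are most valuable for training the event-camera detector. These images are combined with event images to form cross-modal mixed images, which help the detector identify and localize objects in event images more accurately. To mitigate the impact of noisy labels on detector performance, a multi-stage joint optimization strategy for object detection is designed: when each training stage completes, noisy labels are identified among the global labels, corrected, and then used in the next stage. Experimental results show that, on the 1Mpx Detection Dataset, the cross-modal noise-filtering event-camera object detection algorithm provides a model gain of 8.35% over the baseline model, far outperforming noisy-label learning methods such as Co-teaching and O2U-net. Specifically, cross-modal mixed-image training and the joint optimization framework provide model gains of 6.44% and 4.77%, respectively.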
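The multi-stage strategy described in the abstract (train, identify noisy labels, correct them, reuse them in the next stage) can be illustrated with a minimal sketch. The abstract does not say how noisy labels are identified or corrected, so everything below is an assumption made for illustration: the IoU-based rule, the thresholds `clean_iou`, `match_iou`, and `min_score`, and the hypothetical callbacks `train_fn` and `predict_fn` are not taken from the paper.

```python
from typing import Any, Callable, List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)
Pred = Tuple[Box, float]                 # (box, confidence score)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def correct_labels(labels: List[Box], preds: List[Pred], clean_iou: float = 0.7,
                   match_iou: float = 0.3, min_score: float = 0.5) -> List[Box]:
    """Assumed rule: a label that overlaps its best confident prediction only
    partially (match_iou <= IoU < clean_iou) is treated as noisy and replaced
    by that prediction; every other label is kept unchanged."""
    confident = [box for box, score in preds if score >= min_score]
    corrected = []
    for label in labels:
        best = max(confident, key=lambda p: iou(label, p), default=None)
        overlap = iou(label, best) if best is not None else 0.0
        corrected.append(best if match_iou <= overlap < clean_iou else label)
    return corrected


def multi_stage_training(labels: List[List[Box]],
                         train_fn: Callable[[List[List[Box]]], Any],
                         predict_fn: Callable[[Any, int], List[Pred]],
                         num_stages: int = 3) -> List[List[Box]]:
    """After each stage, re-check all labels and pass the corrected set on.

    train_fn(labels) -> model and predict_fn(model, image_index) -> predictions
    are hypothetical placeholders for a real detector.
    """
    for _ in range(num_stages):
        model = train_fn(labels)
        labels = [correct_labels(per_image, predict_fn(model, i))
                  for i, per_image in enumerate(labels)]
    return labels
```

In this sketch a label that no confident prediction overlaps is left as it is rather than discarded. That is a conservative design choice of the example, not a statement about the authors' method.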