Computer Science ›› 2024, Vol. 51 ›› Issue (6): 215-222. doi: 10.11896/jsjkx.230500085

• Computer Graphics & Multimedia •

LiDAR-Radar Fusion Object Detection Algorithm Based on BEV Occupancy Prediction

LI Yuehao1,2, WANG Dengjiang3, JIAN Haifang1, WANG Hongchang1,2, CHENG Qinghua1,2   

  1 Laboratory of Solid State Optoelectronics Information Technology, Institute of Semiconductors, Chinese Academy of Sciences, Beijing 100083, China
  2 College of Materials Science and Opto-Electronic Technology, University of Chinese Academy of Sciences, Beijing 101499, China
  3 Beijing VanJee Technology Suzhou R&D Institute, Suzhou, Jiangsu 215133, China
  • Received: 2023-05-12 Revised: 2023-09-12 Online: 2024-06-15 Published: 2024-06-05
  • About author: LI Yuehao, born in 2000, master. His main research interest is multi-modal fusion algorithms.
    JIAN Haifang, born in 1978, Ph.D, researcher, is a member of CCF (No.O2087M). His main research interests include intelligent information processing algorithms and systems.
  • Supported by:
    Scientific and Technological Innovation 2030-“New Generation Artificial Intelligence” Major Project (2022ZD0116300).

Abstract: Beam attenuation and target occlusion in LiDAR operating environments make the output point cloud sparse at long range, so the accuracy of LiDAR-based 3D object detection algorithms degrades with distance. To address this problem, a LiDAR-radar fusion object detection algorithm based on BEV occupancy prediction is proposed. First, a simplified bird’s eye view (BEV) occupancy prediction sub-network is proposed to generate position-related radar features, which also alleviates the network convergence difficulty caused by the sparsity of radar data. Then, to achieve cross-modal feature fusion, a multi-scale LiDAR-radar fusion layer based on BEV-space feature correlation is designed. Experimental results on the nuScenes dataset show that the proposed radar branch network reaches a mean average precision (mAP) of 21.6% with an inference time of 8.3 ms. After the fusion layer is added, the mAP of the multi-modal detection algorithm improves by 2.9% over the baseline algorithm CenterPoint, with an additional inference time overhead of only 8.6 ms. At a distance of 30 m from the sensor, the detection accuracy of the multi-modal algorithm for the 10 categories in the nuScenes dataset increases by 2.1%~16.0% compared with CenterPoint.
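
To make the notion of a BEV occupancy map concrete, the following is a minimal sketch of how sparse 2D point positions can be rasterized into a binary occupancy grid of the kind an occupancy prediction sub-network could be trained against. It is not the authors' network: the function name bev_occupancy, the grid extent, and the cell size are illustrative assumptions, and the paper's actual grid parameters are not given in the abstract.

```python
from typing import List, Sequence, Tuple


def bev_occupancy(
    points: Sequence[Tuple[float, float]],
    x_range: Tuple[float, float] = (-4.0, 4.0),  # assumed extent in metres
    y_range: Tuple[float, float] = (-4.0, 4.0),  # assumed extent in metres
    cell: float = 1.0,                           # assumed cell size in metres
) -> List[List[int]]:
    """Rasterize (x, y) points into a binary BEV occupancy grid.

    grid[row][col] is 1 if at least one point falls inside that cell.
    """
    nx = int((x_range[1] - x_range[0]) / cell)
    ny = int((y_range[1] - y_range[0]) / cell)
    grid = [[0] * nx for _ in range(ny)]
    for x, y in points:
        # Skip points outside the grid extent.
        if not (x_range[0] <= x < x_range[1] and y_range[0] <= y < y_range[1]):
            continue
        # Clamp to guard against floating-point rounding at the upper edge.
        col = min(int((x - x_range[0]) / cell), nx - 1)
        row = min(int((y - y_range[0]) / cell), ny - 1)
        grid[row][col] = 1
    return grid


# Hypothetical radar returns: two inside the grid, one out of range.
grid = bev_occupancy([(0.5, 0.5), (-3.2, 2.7), (10.0, 0.0)])
occupied = sum(sum(row) for row in grid)
```

With only a few returns per frame, most cells stay empty, which illustrates the radar sparsity that the abstract identifies as a source of convergence difficulty.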

Key words: 3D object detection, LiDAR, Radar, Occupancy prediction, Bird’s eye view, Feature fusion

CLC Number: TP391