Computer Science ›› 2024, Vol. 51 ›› Issue (6): 215-222. doi: 10.11896/jsjkx.230500085

• Computer Graphics & Multimedia •

  • Corresponding author: JIAN Haifang(jhf@semi.ac.cn)
  • First author: LI Yuehao(liyuehao22@mails.ucas.ac.cn)

LiDAR-Radar Fusion Object Detection Algorithm Based on BEV Occupancy Prediction

LI Yuehao1,2, WANG Dengjiang3, JIAN Haifang1, WANG Hongchang1,2, CHENG Qinghua1,2   

  1 Laboratory of Solid State Optoelectronics Information Technology,Institute of Semiconductors,Chinese Academy of Sciences,Beijing 100083,China
    2 College of Materials Science and Opto-Electronic Technology,University of Chinese Academy of Sciences,Beijing 101499,China
    3 Beijing VanJee Technology Suzhou R&D Institute,Suzhou,Jiangsu 215133,China
  • Received:2023-05-12 Revised:2023-09-12 Online:2024-06-15 Published:2024-06-05
  • About author:LI Yuehao,born in 2000,master.His main research interest is multi-modal fusion algorithms.
    JIAN Haifang,born in 1978,Ph.D,researcher,is a member of CCF(No.O2087M).His main research interests are intelligent information processing algorithms and systems.
  • Supported by:
    Scientific and Technological Innovation 2030-“New Generation Artificial Intelligence” Major Project(2022ZD0116300).



Abstract: Beam attenuation and target occlusion in the working environment of LiDAR cause the output point cloud to become sparse at long range,so the detection accuracy of LiDAR-based 3D object detection algorithms degrades with distance.To address this problem,a LiDAR-radar fusion object detection algorithm based on occupancy prediction in bird's eye view(BEV) space is proposed.First,a simplified BEV occupancy prediction sub-network is proposed to generate position-related radar features;it also helps to overcome the network convergence difficulties caused by the sparsity of radar data.Then,to achieve cross-modal feature fusion,a multi-scale LiDAR-radar feature fusion layer based on BEV-space feature correlation is designed.Experimental results on the nuScenes dataset show that the mean average precision(mAP) of the proposed radar branch network reaches 21.6% with an inference time of 8.3 ms.With the fusion layer added,the mAP of the multi-modal detection algorithm improves by 2.9% over the baseline algorithm CenterPoint,while the additional inference time overhead is only 8.6 ms.At a distance of 30 m from the sensor,the detection accuracy achievement rates of the multi-modal algorithm for the 10 categories in the nuScenes dataset improve by 2.1%~16.0% over CenterPoint.
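As a rough illustration of the pipeline described in the abstract, the sketch below rasterizes radar points into a BEV occupancy grid and uses that grid to gate radar features before channel-wise fusion with LiDAR BEV features. All names, the grid resolution, and the gating rule are hypothetical simplifications: the paper's method predicts occupancy with a learned sub-network and fuses the modalities at multiple scales, neither of which is reproduced here.

```python
import numpy as np

def radar_to_bev_occupancy(points, grid_range=51.2, voxel=0.8):
    """Rasterize radar points (x,y in metres) into a binary BEV occupancy grid.

    Illustrative stand-in for the occupancy-prediction sub-network: here
    occupancy is read directly off the raw points, whereas the paper
    predicts it with a learned network.
    """
    n = int(2 * grid_range / voxel)  # grid is n x n cells
    grid = np.zeros((n, n), dtype=np.float32)
    # Keep points inside the BEV range and map them to cell indices.
    mask = (np.abs(points[:, 0]) < grid_range) & (np.abs(points[:, 1]) < grid_range)
    ij = ((points[mask, :2] + grid_range) / voxel).astype(int)
    grid[ij[:, 1], ij[:, 0]] = 1.0  # row = y index, column = x index
    return grid

def fuse_bev_features(lidar_feat, radar_feat, occupancy):
    """Fuse LiDAR and radar BEV feature maps, gating radar by occupancy.

    lidar_feat: (C1,H,W), radar_feat: (C2,H,W), occupancy: (H,W).
    Radar features are kept only where occupancy is predicted, then the
    two modalities are concatenated along the channel axis.
    """
    gated = radar_feat * occupancy[None, :, :]
    return np.concatenate([lidar_feat, gated], axis=0)
```

With the default settings the grid covers ±51.2 m at 0.8 m per cell (128×128); a point at (10, −4) lands in one cell, and a point at (100, 0) is discarded as out of range before fusion.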

Key words: 3D object detection, LiDAR, Millimeter-wave radar, Occupancy prediction, Bird's eye view, Feature fusion

CLC Number: TP391