Computer Science ›› 2026, Vol. 53 ›› Issue (6): 214-231. doi: 10.11896/jsjkx.250400111

• Computer Graphics & Multimedia •

Review of 3D Object Detection Based on LiDAR-camera Fusion

JI Wenyu1, LI Yang1, WANG Jiabao1, FU Ruizhi2, LIU Xiaoyu1, MIAO Zhuang1   

  1 College of Command and Control Engineering, Army Engineering University of PLA, Nanjing 210007, China
  2 Unit 32316 of PLA, Urumqi 830000, China
  • Received: 2025-04-23 Revised: 2025-07-14 Online: 2026-06-15 Published: 2026-06-09
  • About author: JI Wenyu, born in 2002, postgraduate. His main research interests include deep learning and computer vision.
    MIAO Zhuang, born in 1976, Ph.D., professor, is a member of CCF (No. E20031151). His main research interests include artificial intelligence, pattern recognition and computer vision.
  • Supported by:
    National Natural Science Foundation of China (62273356), High-level Talents Innovation Project (KYZYJQJY2101), National Talent Project of China (2022-JCJQ-ZQ-001) and Provincial Primary Research & Development Plan of Jiangsu (BE2023809).

Abstract: Multi-modal 3D object detection, a fundamental task in autonomous driving and human-robot collaboration, has attracted significant attention in recent years. By integrating LiDAR and camera data, it enables effective information exchange and feature fusion across modalities, which improves the understanding of complex environments and raises detection accuracy. However, as the number of fusion methods grows, traditional methods designed for dense scenarios run into high computational cost and limited detection range, and therefore fall short of real-world requirements for long-range detection. Emerging methods consequently focus on novel fusion architectures that address detection in sparse scenarios. This paper is the first Chinese-language review to classify multi-modal 3D object detection methods from the perspective of dense versus sparse scenarios; it analyzes the characteristics of the existing literature and summarizes the evolutionary trends in the field. The main contributions are as follows: 1) A novel classification framework is proposed that distinguishes dense-level from sparse-level fusion based on whether dense Bird's-Eye-View (BEV) feature maps are incorporated; it comprises two major categories and five subcategories, and the characteristics and application scenarios of the different fusion strategies are explained. 2) Fifteen publicly available datasets are reviewed in detail, with a focus on the evaluation metrics of three mainstream datasets, and the experimental results of more than 10 methods on these benchmarks are compared. 3) The limitations of representative fusion methods are critically examined, and future research directions are outlined.

Key words: LiDAR, Camera images, Point cloud, 3D object detection, Sensor fusion

CLC Number: 

  • TP391