Three-dimensional Object Detection Algorithm of Road Scene Based on Attention Mechanism

doi:10.11896/jsjkx.241100112

Abstract: With the development of deep learning and on-board LiDAR, driverless cars place increasingly high demands on detection: obstacles on the road must be detected accurately, and detection must also be fast. In complex road scenes, targets are often occluded and some targets are small, which makes them difficult to detect accurately. To address this problem, this paper proposes an improved 3D object detection method based on the PointPillars model that achieves higher accuracy while maintaining detection speed. First, a variety of data augmentation operations are introduced to increase the diversity and size of the dataset and to reduce overfitting. Then, an attention matrix is added to the pillar feature extraction stage, and the importance of each voxel is adjusted dynamically according to its position and semantic information, so that the model focuses on the features most useful for object detection. Finally, channel attention (CA) and spatial attention (SA) modules are added in succession to the backbone network; they strengthen the model's response to useful information, suppress the interference of unimportant features with the detection results, and thereby improve the representation of target features. The experimental results show that the improved model achieves higher detection accuracy in every category and at every detection difficulty level.
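The order of the two backbone modules described above (channel attention first, then spatial attention) can be illustrated with a small sketch. This is not the paper's implementation. It assumes a feature map stored as nested lists of shape C x H x W, and it replaces the learned layers of a trained attention module with parameter-free gates (a sigmoid over average-pooled plus max-pooled values). The function names `channel_attention`, `spatial_attention` and `apply_ca_then_sa` are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def channel_attention(feat):
    """Return one gate in (0, 1) per channel of a C x H x W feature map."""
    gates = []
    for channel in feat:
        values = [v for row in channel for v in row]
        avg_pool = sum(values) / len(values)
        max_pool = max(values)
        # Assumption: a trained module would feed both pooled values through
        # learned layers; here they are simply summed before the sigmoid.
        gates.append(sigmoid(avg_pool + max_pool))
    return gates


def spatial_attention(feat):
    """Return an H x W map of gates in (0, 1), pooled across channels."""
    num_channels = len(feat)
    height, width = len(feat[0]), len(feat[0][0])
    gates = []
    for i in range(height):
        row = []
        for j in range(width):
            column = [feat[c][i][j] for c in range(num_channels)]
            # Assumption: a trained module would apply a learned convolution
            # to the pooled maps; here the two pooled values are summed.
            row.append(sigmoid(sum(column) / num_channels + max(column)))
        gates.append(row)
    return gates


def apply_ca_then_sa(feat):
    """Reweight channels first, then reweight spatial locations."""
    ca = channel_attention(feat)
    scaled = [[[v * g for v in row] for row in channel]
              for channel, g in zip(feat, ca)]
    sa = spatial_attention(scaled)
    return [[[v * sa[i][j] for j, v in enumerate(row)]
             for i, row in enumerate(channel)]
            for channel in scaled]


# Toy input: 2 channels, 2 x 2 grid, a single active cell in channel 0.
features = [[[1.0, 0.0], [0.0, 0.0]],
            [[0.0, 0.0], [0.0, 0.0]]]
refined = apply_ca_then_sa(features)
```

Because every gate lies strictly between 0 and 1, the output keeps the shape of the input and each activation is scaled down by an amount that depends on its channel and its location. In a trained network the gates would be produced by learned weights, so informative channels and locations would be attenuated less than uninformative ones.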

Key words: LiDAR, 3D object detection, Point cloud, Data augmentation, Attention mechanism

CLC Number: TN958

CAO Wenbo, WEI Mingyang, DUAN Xiaoyong, LIU Xueyuan. Three-dimensional Object Detection Algorithm of Road Scene Based on Attention Mechanism[J].Computer Science, 2025, 52(11A): 241100112-7.
