Computer Science ›› 2024, Vol. 51 ›› Issue (5): 162-171. doi: 10.11896/jsjkx.230300113

• Computer Graphics & Multimedia •

3D Object Detection Based on Edge Convolution and Bottleneck Attention Module for Point Cloud

JIAN Yingjie, YANG Wenxia, FANG Xi, HAN Huan   

  1. School of Science, Wuhan University of Technology, Wuhan 430070, China
  • Received: 2023-03-13 Revised: 2023-10-21 Online: 2024-05-15 Published: 2024-05-08
  • About author: JIAN Yingjie, born in 1998, postgraduate. His main research interests include 2D and 3D object detection.
    YANG Wenxia, born in 1978, Ph.D., associate professor. Her main research interests include image and video processing.
  • Supported by:
    National Key R&D Program of China (2020YFA0714200) and National Natural Science Foundation of China (11901443).

Abstract: Because point cloud data are highly sparse, current point-cloud-based 3D object detection methods learn local features inadequately, and invalid information in the point cloud can interfere with detection. To address these problems, a 3D object detection model based on edge convolution (EdgeConv) and a bottleneck attention module (BAM) is proposed. First, a K-nearest-neighbor graph is built in feature space for each point of the point cloud, and multilayer edge convolutions are constructed on it to learn multi-scale local features. Second, a BAM is designed for 3D point cloud data; each BAM consists of a channel attention module and a spatial attention module and enhances the point cloud information that is valuable for object detection, strengthening the feature representation of the model. The network uses VoteNet as the baseline, and the multilayer edge convolutions and the BAM are added sequentially between PointNet++ and the voting module. The proposed model is evaluated against 13 other state-of-the-art methods on two benchmark datasets, SUN RGB-D and ScanNetV2. Experimental results show that on SUN RGB-D the proposed model achieves the highest mAP@0.5 and the highest AP@0.25 for six of the ten categories, including bed, chair and desk. On ScanNetV2, the model outperforms the 13 other methods in mAP at IoU thresholds of both 0.25 and 0.5, and achieves the highest AP@0.25 for ten of the eighteen categories, including chair, sofa and picture. Compared with the baseline VoteNet, the mAP@0.25 of the proposed model improves by 6.5% and 12.9% on the two datasets, respectively. Ablation studies verify the contribution of each component.
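
The two building blocks named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, weight shapes, the choice to exclude each point from its own neighborhood, and the form of the attention gate are all assumptions. The edge feature follows the common EdgeConv form (a center point concatenated with neighbor-minus-center differences, then max-pooled), and the gate follows the original image-domain BAM formulation (channel and spatial branches summed, passed through a sigmoid, applied residually), adapted here so that the "spatial" branch scores individual points.

```python
import numpy as np


def knn_indices(features, k):
    """Indices of the k nearest neighbours of each point, measured in feature space."""
    features = np.asarray(features, dtype=float)
    diff = features[:, None, :] - features[None, :, :]
    dist2 = np.sum(diff * diff, axis=-1)          # (N, N) squared distances
    np.fill_diagonal(dist2, np.inf)               # assumption: a point is not its own neighbour
    return np.argsort(dist2, axis=1)[:, :k]       # (N, k)


def edge_conv(features, weight, k):
    """One edge-convolution layer (hypothetical shapes).

    features: (N, C_in) point features; weight: (2 * C_in, C_out) shared linear map.
    """
    features = np.asarray(features, dtype=float)
    idx = knn_indices(features, k)                # graph rebuilt from the current features
    neighbours = features[idx]                    # (N, k, C_in)
    centre = np.broadcast_to(features[:, None, :], neighbours.shape)
    edge = np.concatenate([centre, neighbours - centre], axis=-1)  # (N, k, 2 * C_in)
    hidden = np.maximum(edge @ weight, 0.0)       # shared linear map + ReLU on every edge
    return hidden.max(axis=1)                     # max over neighbours -> (N, C_out)


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def bam_gate(features, w_c1, w_c2, w_s):
    """BAM-style gate for point features (assumed adaptation, not the paper's design).

    w_c1: (C, C_r) and w_c2: (C_r, C) form the channel bottleneck; w_s: (C, 1) scores points.
    """
    channel = np.maximum(features.mean(axis=0) @ w_c1, 0.0) @ w_c2  # (C,) channel branch
    pointwise = features @ w_s                                       # (N, 1) per-point branch
    attention = sigmoid(channel[None, :] + pointwise)                # (N, C) combined gate
    return features * (1.0 + attention)                              # residual refinement


# Usage with random data: two stacked edge convolutions followed by one gate.
rng = np.random.default_rng(0)
points = rng.normal(size=(32, 3))
layer1 = edge_conv(points, 0.1 * rng.normal(size=(6, 16)), k=8)
layer2 = edge_conv(layer1, 0.1 * rng.normal(size=(32, 16)), k=8)
refined = bam_gate(layer2,
                   0.1 * rng.normal(size=(16, 4)),
                   0.1 * rng.normal(size=(4, 16)),
                   0.1 * rng.normal(size=(16, 1)))
```

Because the second layer recomputes its neighbor graph from the first layer's output rather than from the raw coordinates, neighborhoods are defined in feature space, which is the property the abstract emphasizes.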

Key words: 3D object detection, Point clouds, Edge convolution, Bottleneck attention module, VoteNet, SUN RGB-D dataset, ScanNetV2 dataset

CLC Number: 

  • TP183