Computer Science ›› 2023, Vol. 50 ›› Issue (11A): 221200152-8.doi: 10.11896/jsjkx.221200152

• Image Processing & Multimedia Technology •

Object Region Guided 3D Target Detection in RGB-D Scenes

MIAO Yongwei1,2, SHAN Feng2, DU Sicheng3, WANG Jinrong1, ZHANG Xudong4   

1 School of Information Science and Technology,Hangzhou Normal University,Hangzhou 311121,China
    2 School of Computer Science and Technology,Zhejiang Sci-Tech University,Hangzhou 310018,China
    3 School of Natural Sciences,King's College London,London N1C 4BQ,UK
    4 School of Information Science and Technology,Zhejiang Shuren University,Hangzhou 310015,China
  • Published:2023-11-09
  • About author:ZHANG Xudong,born in 1982,Ph.D,associate professor.His main research interests include computer graphics and computer vision.
  • Supported by:
    National Natural Science Foundation of China(61972458),Natural Science Foundation of Zhejiang Province,China(LZ23F020002)and Zhejiang Public Welfare Application Research Project(LGF22F020006).

Abstract: 3D object detection in RGB-D scenes is an important problem in computer graphics and 3D vision.Existing methods adapt poorly to the complex backgrounds of RGB-D scenes and struggle to effectively combine object region information with the intrinsic features of sampling points.To overcome these limitations,a novel object region guided 3D detection framework is proposed,which combines the global and local features of sampling points while eliminating background interference.The framework takes the RGB-D data of a 3D scene as input.First,the 2D regions of different objects in the underlying RGB image are extracted and roughly classified.The 2D bounding boxes of these objects are then lifted to their corresponding 3D oblique frustum regions,and the RGB-D data located in each frustum is converted to point cloud data.Next,guided by the object region information,the features of the sampling points located in each frustum are extracted,and the global and local features of the sampling points are effectively fused by feature transformation and max-pooling aggregation.The fused features are used to predict a probability score for each sampling point that reflects whether it belongs to the foreground or the background.According to this score,the foreground and background sampling points are segmented,and a masked point cloud is generated by removing the background sampling points from the underlying 3D scene.Finally,object center points are generated by voting in the masked point cloud,and 3D proposals and bounding box predictions are made with the aid of the object region information.In addition,a corner loss is added to improve the accuracy of the predicted bounding boxes.Experiments on the public SUN RGB-D dataset show that the proposed framework is effective for 3D object detection.Compared with traditional methods,its point cloud detection accuracy reaches 59.1% under the same evaluation metric,and the 3D bounding boxes of objects are accurately estimated even in regions with strong occlusion or sparse sampling points.
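The first step above, lifting a 2D detection box to an oblique 3D frustum and cropping the scene points that fall inside it, can be sketched as follows. This is a minimal NumPy illustration under an assumed pinhole camera model; the function and parameter names are hypothetical and not taken from the paper.

```python
import numpy as np

def frustum_points(points_cam, box2d, K):
    """Select the 3D points whose image projection falls inside a 2D box.

    points_cam: (N, 3) points in camera coordinates (z forward, metres)
    box2d:      (xmin, ymin, xmax, ymax) detection box in pixels
    K:          (3, 3) pinhole intrinsic matrix
    Returns the subset of points lying in the oblique frustum spanned
    by the box.
    """
    x, y, z = points_cam[:, 0], points_cam[:, 1], points_cam[:, 2]
    # Project each point onto the image plane.
    u = K[0, 0] * x / z + K[0, 2]
    v = K[1, 1] * y / z + K[1, 2]
    xmin, ymin, xmax, ymax = box2d
    # Keep only points in front of the camera whose projection is in the box.
    mask = (z > 0) & (u >= xmin) & (u <= xmax) & (v >= ymin) & (v <= ymax)
    return points_cam[mask]
```

Each 2D box thus yields its own point subset, which is what the subsequent region-guided feature extraction operates on.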

Key words: 3D object detection, Foreground point cloud extraction, Point cloud segmentation, RGB-D, Regional information
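The corner loss mentioned in the abstract can be illustrated with a small sketch: the distance between the eight corners of the predicted and ground-truth boxes, taking the minimum over the 180-degree-flipped ground truth so that heading ambiguity is not penalized. The box convention (center, size (l,w,h), yaw about the vertical axis) and all names here are assumptions following the common frustum-based formulation, not necessarily the paper's exact definition.

```python
import numpy as np

def box_corners(center, size, heading):
    """Return the 8 corners of a 3D box given its centre, (l, w, h) size
    and yaw angle about the z axis (assumed convention)."""
    l, w, h = size
    xs = np.array([1, 1, 1, 1, -1, -1, -1, -1]) * l / 2
    ys = np.array([1, 1, -1, -1, 1, 1, -1, -1]) * w / 2
    zs = np.array([1, -1, 1, -1, 1, -1, 1, -1]) * h / 2
    c, s = np.cos(heading), np.sin(heading)
    R = np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])  # yaw rotation
    return (R @ np.vstack([xs, ys, zs])).T + center

def corner_loss(pred, gt):
    """Mean corner distance between predicted and ground-truth boxes,
    taking the minimum over the flipped ground truth."""
    pc = box_corners(*pred)
    gc = box_corners(gt[0], gt[1], gt[2])
    gc_flip = box_corners(gt[0], gt[1], gt[2] + np.pi)
    d = np.linalg.norm(pc - gc, axis=1).mean()
    d_flip = np.linalg.norm(pc - gc_flip, axis=1).mean()
    return min(d, d_flip)
```

A perfectly predicted box, or one that only differs from the ground truth by a 180-degree heading flip, incurs (near-)zero loss under this formulation.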

CLC Number: 

  • TP391