Computer Science ›› 2023, Vol. 50 ›› Issue (11A): 221200152-8. doi: 10.11896/jsjkx.221200152
缪永伟1,2, 单丰2, 杜思澄3, 王金荣1, 张旭东4
MIAO Yongwei1,2, SHAN Feng2, DU Sicheng3, WANG Jinrong1, ZHANG Xudong4
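Abstract: 3D object detection from RGB-D data of indoor scenes is an important problem in computer graphics and 3D vision. Existing 3D object detection in RGB-D scenes adapts poorly to complex backgrounds and struggles to make effective use of object region information and scene point cloud features. To address these shortcomings, a 3D object detection framework guided by object region information is proposed, which fuses global and local point cloud features and excludes background interference. The framework takes scene RGB-D data as input. It first extracts the 2D regions of the target objects in the color image and assigns each object a coarse class, then lifts each object's 2D bounding box to a 3D frustum region and converts it into point cloud data. Next, features are extracted on the frustum point cloud using the object region class information, and feature transformation and max-pooling aggregation are used to fuse the global and local point cloud features. The fused features are then used to predict, for each sampled point, a probability score of how strongly it is associated with the foreground or the background. According to this score, the scene is segmented into foreground and background points, and the background points are removed to form a masked point cloud. Finally, object center points are generated by voting in the masked point cloud, and proposals and 3D object predictions are produced with the help of the object region information. In addition, a corner loss is added to refine bounding box accuracy. The network is trained on the SUN RGB-D dataset. Experimental results show that, compared with traditional methods, the proposed framework effectively improves detection accuracy, reaching a point cloud object detection accuracy of 59.1% under the same evaluation metric, and that it can accurately estimate 3D object bounding boxes even under heavy occlusion or in sparsely sampled regions.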
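Two of the steps named in the abstract, lifting a 2D box to a frustum point cloud and removing background points by score, can be illustrated with a minimal sketch. This is not the authors' implementation: the pinhole camera model, the function names `lift_box_to_frustum` and `mask_background`, the intrinsics `fx, fy, cx, cy`, and the 0.5 threshold are all assumptions made for illustration, and the foreground scores that the paper predicts with a network are passed in here as plain numbers.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]
Box = Tuple[int, int, int, int]  # (u_min, v_min, u_max, v_max), max exclusive


def lift_box_to_frustum(depth: Sequence[Sequence[float]],
                        box: Box,
                        fx: float, fy: float,
                        cx: float, cy: float) -> List[Point]:
    """Back-project every pixel inside a 2D box to a 3D camera-frame point.

    Assumes a pinhole camera (hypothetical intrinsics fx, fy, cx, cy).
    Pixels with non-positive depth are treated as missing and skipped.
    """
    u_min, v_min, u_max, v_max = box
    points: List[Point] = []
    for v in range(v_min, v_max):
        for u in range(u_min, u_max):
            z = depth[v][u]
            if z <= 0:
                continue
            x = (u - cx) * z / fx
            y = (v - cy) * z / fy
            points.append((x, y, z))
    return points


def mask_background(points: Sequence[Point],
                    fg_scores: Sequence[float],
                    threshold: float = 0.5) -> List[Point]:
    """Keep only points whose foreground score reaches the threshold.

    In the paper the scores come from fused point features; here they are
    supplied by the caller, and the threshold value is an assumption.
    """
    if len(points) != len(fg_scores):
        raise ValueError("one score per point is required")
    return [p for p, s in zip(points, fg_scores) if s >= threshold]


if __name__ == "__main__":
    # Toy 4x4 depth map; the 0.0 entry stands for a missing reading.
    depth_map = [
        [2.0, 2.0, 2.0, 2.0],
        [2.0, 1.0, 1.0, 2.0],
        [2.0, 1.0, 0.0, 2.0],
        [2.0, 2.0, 2.0, 2.0],
    ]
    frustum = lift_box_to_frustum(depth_map, (1, 1, 3, 3),
                                  fx=2.0, fy=2.0, cx=1.0, cy=1.0)
    masked = mask_background(frustum, [0.9, 0.2, 0.7])
    print(len(frustum), len(masked))
```

The voting stage would then run only on the points returned by `mask_background`, which is what keeps background clutter from contributing votes in this simplified picture.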