Survey of Rigid Object Pose Estimation Algorithms Based on Deep Learning

doi:10.11896/jsjkx.211200164

Abstract

Rigid object pose estimation aims to recover the 3D translation and 3D rotation of a rigid object in the camera coordinate system, and it plays an important role in rapidly developing fields such as autonomous driving, robotics, and augmented reality. This paper summarizes and analyzes representative work on deep-learning-based rigid object pose estimation published between 2017 and 2021. The methods are divided into coordinate-based, keypoint-based, and template-based approaches. The task is decomposed into four sub-tasks: image preprocessing, spatial mapping or feature matching, pose recovery, and pose optimization. For each class of methods, the implementation of each sub-task is described in detail, together with its advantages and open problems. The challenges facing rigid object pose estimation are analyzed, and existing solutions and their strengths and weaknesses are summarized. Building on the rigid object methods, pose estimation for articulated and deformable objects is also analyzed. Commonly used datasets and performance evaluation metrics are introduced, and the performance of existing methods on these datasets is compared. Finally, future research directions are discussed from several perspectives, including pose tracking and category-level pose estimation.

Key words: Computer vision, Rigid object, Pose estimation, Pose optimization, Deep learning
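As a rough illustration of what the abstract means by a pose (a 3D rotation plus a 3D translation in the camera coordinate system), the sketch below maps one object-frame point into the camera frame and projects it with a pinhole model. It is not taken from the paper: the function names, the intrinsics (`fx`, `fy`, `cx`, `cy`), and all numeric values are assumptions chosen only for illustration.

```python
import math

def rotation_z(angle_rad):
    """3x3 rotation matrix for a rotation about the camera z-axis."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return [[c, -s, 0.0],
            [s,  c, 0.0],
            [0.0, 0.0, 1.0]]

def apply_pose(R, t, point):
    """Map a point from the object frame to the camera frame: R * p + t."""
    return [sum(R[i][j] * point[j] for j in range(3)) + t[i] for i in range(3)]

def project(point_cam, fx, fy, cx, cy):
    """Pinhole projection of a camera-frame point to pixel coordinates."""
    x, y, z = point_cam
    return (fx * x / z + cx, fy * y / z + cy)

# Hypothetical values, for illustration only.
R = rotation_z(math.pi / 2)   # 90 degree rotation about z
t = [0.0, 0.0, 2.0]           # object 2 m in front of the camera
p_obj = [0.1, 0.0, 0.0]       # a point on the object model, in metres

p_cam = apply_pose(R, t, p_obj)
u, v = project(p_cam, fx=500.0, fy=500.0, cx=320.0, cy=240.0)
```

Pose recovery, as listed among the four sub-tasks, runs in the opposite direction: given correspondences between model points and image observations, it solves for `R` and `t`.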

CLC Number: TP391

GUO Nan, LI Jingyuan, REN Xi. Survey of Rigid Object Pose Estimation Algorithms Based on Deep Learning[J]. Computer Science, 2023, 50(2): 178-189. https://doi.org/10.11896/jsjkx.211200164

