Computer Science, 2023, Vol. 50, Issue (2): 178-189. doi: 10.11896/jsjkx.211200164

• Computer Graphics & Multimedia •

Survey of Rigid Object Pose Estimation Algorithms Based on Deep Learning

GUO Nan, LI Jingyuan, REN Xi   

  1. School of Computer Science and Engineering, Northeastern University, Shenyang 110167, China
  • Received: 2021-12-14 Revised: 2022-03-02 Online: 2023-02-15 Published: 2023-02-22
  • Supported by:
    National Natural Science Foundation of China (52130403) and Fundamental Research Funds for the Central Universities of Ministry of Education of China (N2017003)

Abstract: Rigid object pose estimation aims to recover the 3D translation and 3D rotation of a rigid object in the camera coordinate system, and plays an important role in rapidly developing fields such as autonomous driving, robotics, and augmented reality. This survey summarizes and analyzes representative papers on deep-learning-based rigid object pose estimation published from 2017 to 2021. The methods are divided into coordinate-based, keypoint-based, and template-based methods, and the task is decomposed into four sub-tasks: image preprocessing, spatial mapping or feature matching, pose recovery, and pose optimization. How each method realizes these sub-tasks, together with its advantages and problems, is described in detail. The challenges of rigid object pose estimation are analyzed, and the existing solutions and their advantages and disadvantages are summarized. Building on rigid object pose estimation methods, pose estimation for articulated and deformable objects is also analyzed. Common datasets and performance evaluation metrics for rigid object pose estimation are introduced, and the performance of existing methods on these datasets is compared and analyzed. Finally, future research directions in pose tracking and category-level rigid object pose estimation are discussed.

Key words: Computer vision, Rigid object, Pose estimation, Pose optimization, Deep learning

CLC Number: 

  • TP391
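
The abstract describes a rigid object pose as a 3D rotation plus a 3D translation expressed in the camera coordinate system. The following is a minimal sketch of that representation in plain Python, not code from the surveyed methods. The helper names (`rotation_z`, `apply_pose`, `invert_pose`) and the numeric pose are hypothetical and chosen only for illustration.

```python
import math

def rotation_z(angle_rad):
    """3x3 rotation matrix for a rotation about the z-axis."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return [[c, -s, 0.0],
            [s,  c, 0.0],
            [0.0, 0.0, 1.0]]

def apply_pose(R, t, point):
    """Map a point from the object frame to the camera frame: p_cam = R p_obj + t."""
    return [sum(R[i][j] * point[j] for j in range(3)) + t[i] for i in range(3)]

def invert_pose(R, t):
    """Inverse rigid transform: R_inv = R^T and t_inv = -R^T t."""
    R_inv = [[R[j][i] for j in range(3)] for i in range(3)]
    t_inv = [-sum(R_inv[i][j] * t[j] for j in range(3)) for i in range(3)]
    return R_inv, t_inv

# Hypothetical pose: 90 degree rotation about z, 0.5 units along the camera z-axis.
R = rotation_z(math.pi / 2)
t = [0.0, 0.0, 0.5]

p_obj = [0.1, 0.0, 0.0]                    # a point on the object model
p_cam = apply_pose(R, t, p_obj)            # the same point in the camera frame
R_inv, t_inv = invert_pose(R, t)
p_back = apply_pose(R_inv, t_inv, p_cam)   # round trip back to the object frame
```

In this sketch the pose has six degrees of freedom in total (three for rotation, three for translation), which is why the task is often called 6D pose estimation in the literature the survey covers.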