Computer Science (计算机科学), 2023, Vol. 50, Issue (2): 178-189. doi: 10.11896/jsjkx.211200164
郭楠, 李婧源, 任曦
GUO Nan, LI Jingyuan, REN Xi
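Abstract: Rigid object pose estimation aims to recover the 3D translation and 3D rotation of a rigid object in the camera coordinate system, and it plays an important role in fast-developing fields such as autonomous driving, robotics, and augmented reality. This paper summarizes and analyzes representative deep-learning-based work on rigid object pose estimation published between 2017 and 2021. The methods are grouped into coordinate-based, keypoint-based, and template-based approaches. The task is divided into four subtasks: image preprocessing, spatial mapping or feature matching, pose recovery, and pose refinement. For each class of methods, the paper describes how these subtasks are implemented, along with their advantages and open problems. It then analyzes the challenges facing rigid object pose estimation and summarizes existing solutions with their strengths and weaknesses. Commonly used datasets and performance evaluation metrics are introduced, and the results of existing methods on these datasets are compared. Finally, future research directions are discussed from several perspectives, including pose tracking and category-level pose estimation.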
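As a minimal illustration of what "3D translation and 3D rotation in the camera coordinate system" means, the sketch below maps a point from the object frame into the camera frame as R·p + t. The function names, the choice of a rotation about the z-axis, and the numeric values are hypothetical and are not taken from the paper.

```python
import math

def rotation_z(theta):
    """3x3 rotation matrix for an angle theta (radians) about the z-axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0],
            [s,  c, 0.0],
            [0.0, 0.0, 1.0]]

def apply_pose(R, t, point):
    """Map a point from the object frame to the camera frame: R @ point + t."""
    return [sum(R[i][j] * point[j] for j in range(3)) + t[i] for i in range(3)]

# Hypothetical pose: a 90-degree turn about z, then a 0.5 shift along the camera z-axis.
R = rotation_z(math.pi / 2)
t = [0.0, 0.0, 0.5]
p_cam = apply_pose(R, t, [1.0, 0.0, 0.0])
```

A pose estimator of any of the three families named in the abstract ultimately outputs such an (R, t) pair; the pose-recovery and refinement subtasks differ only in how that pair is obtained.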