Computer Science, 2026, Vol. 53, Issue (6A): 250800006-7. doi: 10.11896/jsjkx.250800006

• Image Processing & Multimedia Technology •

Monocular Real-time 6D Pose Estimation for Weakly Textured Workpieces

FENG Yingbin 1, KANG Xueshi 1, WANG Tianlong 2

  1. School of Automation and Electrical Engineering, Shenyang Ligong University, Shenyang 110159, China
  2. Institute of Robotics and Intelligent Manufacturing Innovation, Chinese Academy of Sciences, Shenyang 110169, China
  • Online: 2026-06-16 Published: 2026-06-12
  • About author: FENG Yingbin, born in 1986, Ph.D., professor. His main research interests include robot environmental perception and modeling, and unmanned autonomous driving technology.
  • Supported by: Liaoning Province Science and Technology Program Project (2023JH2/10700006).

Abstract: To address the varying sizes, occlusion and stacking, and lighting changes of weakly textured workpieces in industrial scenes, a monocular 6D pose estimation method, RAAS-PVNet, is proposed. A resolution adaptive rectangular convolution, RARConv, is designed that dynamically adjusts the convolution kernel size and the number of sampling points, which addresses the limited ability of traditional convolutional structures to model multi-scale information. An angular distance collaborative weighted voting strategy, AS, is proposed: it introduces a perpendicular distance constraint on the extension line of the direction vector and combines it with a continuous weight fusion mechanism to accurately measure the credibility of each voting point, so that the voting results concentrate on high-quality points and the model's resistance to occlusion improves. To address the lack of industrial part datasets in the field of pose estimation, a dataset production method that combines real and synthetic data in proportion is designed and used to construct the workpiece dataset 6DInd. Experiments show that on 6DInd the 2D Projection and ADD(-S) metrics of RAAS-PVNet increase by 10.22% and 10.26%, respectively, that the method is robust under occlusion and lighting changes, and that its processing speed of 30 fps meets real-time requirements.
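The idea of weighting each vote by both angular agreement and perpendicular distance can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the weight formula, so the Gaussian form, the product used to fuse the two terms, and the names `vote_weights`, `score_hypotheses`, `sigma_angle` and `sigma_dist` are all assumptions chosen for illustration. Each pixel carries a predicted unit direction toward a keypoint, and a candidate keypoint is scored by a continuous weight instead of a hard inlier count.

```python
import numpy as np


def vote_weights(pixels, directions, hypothesis, sigma_angle=0.1, sigma_dist=2.0):
    """Continuous per-pixel vote weight for one 2D keypoint hypothesis.

    Hypothetical weight form (an assumption, not taken from the paper):
    a Gaussian on the angle between the predicted direction and the
    pixel-to-hypothesis offset, multiplied by a Gaussian on the
    perpendicular distance from the hypothesis to the direction line.
    """
    pixels = np.asarray(pixels, dtype=float)
    directions = np.asarray(directions, dtype=float)
    # Normalise predicted directions to unit length.
    directions = directions / np.linalg.norm(directions, axis=1, keepdims=True)

    offset = np.asarray(hypothesis, dtype=float) - pixels   # shape (N, 2)
    reach = np.linalg.norm(offset, axis=1)

    # Angle between the predicted direction and the offset to the hypothesis.
    cos_angle = np.sum(offset * directions, axis=1) / np.maximum(reach, 1e-12)
    angle = np.arccos(np.clip(cos_angle, -1.0, 1.0))

    # Perpendicular distance from the hypothesis to the line through the
    # pixel along its direction (magnitude of the 2D cross product).
    perp = np.abs(offset[:, 0] * directions[:, 1] - offset[:, 1] * directions[:, 0])

    return np.exp(-(angle / sigma_angle) ** 2) * np.exp(-(perp / sigma_dist) ** 2)


def score_hypotheses(pixels, directions, hypotheses, **kwargs):
    """Sum the continuous weights over all pixels for each hypothesis."""
    return np.array(
        [vote_weights(pixels, directions, h, **kwargs).sum() for h in hypotheses]
    )
```

In this sketch a pixel whose direction points away from the hypothesis, or whose direction line passes far from it, contributes almost nothing to the score, which is one way votes could be concentrated on high-quality points.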

Key words: 6D pose estimation, RAAS-PVNet, Resolution adaptive rectangular convolution, Angular spacing collaborative weighted voting strategy, 6DInd

CLC Number: TP183