Computer Science ›› 2026, Vol. 53 ›› Issue (6A): 250300109-9. doi: 10.11896/jsjkx.250300109

• Image Processing & Multimedia Technology •

Occlusion Head Pose Estimation Algorithm Based on Riemann Optimization

WANG Baohui, TAN Yingjie, CHEN Jixuan

  1. College of Software, Beihang University, Beijing 100083, China
  • Online: 2026-06-16 Published: 2026-06-12
  • About author: WANG Baohui, born in 1973, senior engineer, master's supervisor. His main research interests include network security, big data, and artificial intelligence.

Abstract: Head pose estimation is an important deep learning task, particularly in in-vehicle driver assistance systems, where it serves as a key means of detecting fatigued driving and has broad application prospects. In practice, however, facial occlusion often causes key feature points to be lost, which seriously degrades model accuracy. To address this problem, this paper proposes the 6DRepLKNet-RGO algorithm. Building on 6DRepNet, the algorithm optimizes the network structure and strengthens feature extraction through a structural re-parameterization design. It also applies a Riemannian manifold gradient optimization strategy to improve the learning of the three-dimensional pose representation and reduce training error. To further improve pose estimation accuracy under occlusion, random erasing data augmentation is added. Experiments show that 6DRepLKNet-RGO reduces error by more than 5% relative to 6DRepNet on public datasets such as BIWI and AFLW2000 and surpasses other advanced models in mean absolute error (MAE), verifying its effectiveness.

Key words: Head pose estimation, Structural re-parameterization, Data augmentation, Gradient optimization, Deep learning

CLC Number: 

  • TP312
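
The random erasing augmentation named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `random_erase`, its parameters, and the default ranges are assumptions chosen for illustration, and the settings used in the paper are not stated on this page. The idea is to overwrite one randomly placed rectangle with noise so that the network sees synthetic occlusions during training.

```python
import numpy as np


def random_erase(image, rng, p=0.5, area_range=(0.02, 0.4),
                 aspect_range=(0.3, 3.3), max_tries=10):
    """Return a copy of `image` (H x W x C, uint8) with one random
    rectangle overwritten by uniform noise.

    All defaults are illustrative assumptions, not the paper's settings.
    """
    out = image.copy()
    if rng.random() >= p:
        return out  # leave this sample unchanged

    h, w = image.shape[:2]
    for _ in range(max_tries):
        # Target area of the erased box, as a fraction of the image area.
        area = rng.uniform(*area_range) * h * w
        # Sample the height/width ratio log-uniformly so that tall and
        # wide boxes are equally likely.
        log_lo, log_hi = np.log(aspect_range[0]), np.log(aspect_range[1])
        aspect = np.exp(rng.uniform(log_lo, log_hi))
        eh = int(round(np.sqrt(area * aspect)))
        ew = int(round(np.sqrt(area / aspect)))
        if 0 < eh < h and 0 < ew < w:
            top = int(rng.integers(0, h - eh + 1))
            left = int(rng.integers(0, w - ew + 1))
            noise = rng.integers(0, 256, size=(eh, ew) + image.shape[2:],
                                 dtype=np.uint8)
            out[top:top + eh, left:left + ew] = noise
            return out
    return out  # no valid box found within max_tries


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    face = np.zeros((64, 64, 3), dtype=np.uint8)  # stand-in for a face crop
    erased = random_erase(face, rng, p=1.0)
    print("pixels changed:", int((erased != face).any(axis=-1).sum()))
```

In a training pipeline this would typically be applied per sample after cropping and resizing, with the pose label left unchanged, since erasing pixels does not alter the head orientation.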