Computer Science ›› 2025, Vol. 52 ›› Issue (10): 159-167. doi: 10.11896/jsjkx.241000066

• Computer Graphics & Multimedia •

Instant Generation Algorithm for High-fidelity Head Avatars Based on NeRF

SHENG Xiaomeng, ZHAO Junli, WANG Guodong, WANG Yang   

  1. College of Computer Science & Technology, Qingdao University, Qingdao, Shandong 266071, China
  • Received: 2024-10-14  Revised: 2024-12-09  Online: 2025-10-15  Published: 2025-10-14
  • About author: SHENG Xiaomeng, born in 2000, postgraduate. Her main research interests include neural radiance fields and 3D human face reconstruction.
    WANG Guodong, born in 1980, Ph.D., professor, is a member of CCF (No.16234M). His main research interests include computer graphics and artificial intelligence.
  • Supported by:
    National Natural Science Foundation of China (62172247) and Qingdao Natural Science Foundation (23-2-1-163-zyyd-jch).

Abstract: To meet the growing demand for rapidly generating high-precision digital humans in fields such as digital entertainment, virtual reality, and the metaverse, this paper proposes a novel method for rapidly reconstructing a high-precision head model from monocular RGB video, together with a new framework dedicated to precise modeling of the facial and neck regions. In particular, the framework integrates neural primitives into a parametric model of the head and neck, adopting the Head-And-neCK (HACK) model as a superior alternative to the widely used FLAME model (Faces Learned with an Articulated Model and Expressions). This substitution markedly improves the precision and efficiency of 3D facial reconstruction. In addition, a real-time adaptive neural radiance field is designed to significantly accelerate training and reconstruction. By introducing a multi-resolution hash grid and employing nearest-triangle search to compute deformation gradients in the deformation space, the proposed method reconstructs high-fidelity head and neck models within minutes. Extensive quantitative and qualitative evaluations demonstrate that the proposed model achieves notable improvements in both rendering quality and training time over existing state-of-the-art methods.
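The multi-resolution hash grid mentioned in the abstract follows the spirit of Instant-NGP-style encodings. Below is a minimal NumPy sketch of such an encoding, not the authors' implementation: the table size, level count, growth factor, and feature width are assumed values chosen purely for illustration, and the spatial hash uses the prime constants popularized by Teschner et al.

```python
import numpy as np

# Illustrative multi-resolution hash encoding: each 3D point is mapped, at
# several grid resolutions, to trilinearly interpolated feature vectors stored
# in a fixed-size hash table. All hyperparameters below are assumptions.

PRIMES = np.array([1, 2654435761, 805459861], dtype=np.uint64)

def hash_coords(coords, table_size):
    """Spatial hash of integer grid coordinates (Teschner-style XOR of primes)."""
    h = np.zeros(coords.shape[:-1], dtype=np.uint64)
    for d in range(coords.shape[-1]):
        h ^= coords[..., d].astype(np.uint64) * PRIMES[d]
    return h % np.uint64(table_size)

def encode(x, tables, base_res=16, growth=1.5):
    """Concatenate interpolated features from all resolution levels.

    x      : (N, 3) points in [0, 1]^3
    tables : list of (table_size, F) feature arrays, one per level
    """
    feats = []
    for level, table in enumerate(tables):
        res = int(base_res * growth ** level)
        pos = x * res
        lo = np.floor(pos).astype(np.int64)
        w = pos - lo                          # trilinear weights
        acc = np.zeros((x.shape[0], table.shape[1]))
        # Accumulate contributions from the 8 corners of the enclosing voxel.
        for corner in range(8):
            offset = np.array([(corner >> d) & 1 for d in range(3)])
            idx = hash_coords(lo + offset, table.shape[0])
            cw = np.prod(np.where(offset == 1, w, 1.0 - w), axis=-1)
            acc += cw[:, None] * table[idx]
        feats.append(acc)
    return np.concatenate(feats, axis=-1)     # fed to a small MLP downstream

# Usage: 4 levels, 2^14-entry tables, 2 features per entry (assumed values).
rng = np.random.default_rng(0)
tables = [rng.normal(size=(2**14, 2)).astype(np.float32) for _ in range(4)]
print(encode(rng.random((5, 3)), tables).shape)   # -> (5, 8)
```

In a full NeRF pipeline the concatenated features, together with a view-direction encoding, would be passed to a small MLP that predicts density and color; training the learnable hash-table entries alongside that MLP is what allows reconstruction within minutes rather than hours.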

Key words: Head avatar, Neural radiance field, High-fidelity model, HACK model, Deep learning

CLC Number: TP391.9