Computer Science ›› 2025, Vol. 52 ›› Issue (3): 58-67. doi: 10.11896/jsjkx.240300030
• 3D Vision and Metaverse •
WANG Xingbo, ZHANG Hao, GAO Hao, ZHAI Mingliang, XIE Jiucheng