Computer Science, 2020, Vol. 47, Issue (9): 142-149. doi: 10.11896/jsjkx.190900203

• Computer Graphics & Multimedia •

Expression Animation Synthesis Based on Improved CycleGan Model and Region Segmentation

YE Ya-nan1,2, CHI Jing1,2, YU Zhi-ping1,2, ZHAN Yu-li1,2 and ZHANG Cai-ming1,2,3,4

  1 School of Computer Science and Technology, Shandong University of Finance and Economics, Jinan 250014, China
  2 Shandong Provincial Key Laboratory of Digital Media Technology, Jinan 250014, China
  3 School of Software, Shandong University, Jinan 250101, China
  4 Future Intelligent Computing Collaborative Innovation Center, Yantai, Shandong 264003, China
  • Received: 2019-06-16 Published: 2020-09-10
  • Corresponding author: CHI Jing (peace_world_cj@126.com)
  • About author: 1325809478@qq.com
    YE Ya-nan, born in 1994, postgraduate (master's) student. Her main research interests include computer animation and digital image processing.
    CHI Jing, born in 1980, Ph.D., associate professor, postgraduate supervisor. Her main research interests include computer animation, geometric shape, and medical image processing.
  • Supported by:
    Natural Science Foundation of Shandong Province for Excellent Young Scholars in Provincial Universities (ZR2018JL022), National Natural Science Foundation of China (61772309, 61602273), Shandong Provincial Key R&D Program (2019GSF109112), Science and Technology Program of Shandong Education Department (J18RA272), and Fostering Project of Dominant Discipline and Talent Team of Shandong Province Higher Education Institutions.

Abstract: Existing facial expression synthesis methods mostly depend on driving source data and suffer from low generation efficiency and poor realism. To address these problems, this paper proposes a new expression animation synthesis method based on an improved CycleGan model and region segmentation. The method synthesizes new expression animation in real time and shows good stability and robustness. It constructs a new covariance constraint in the cycle-consistency loss function of the traditional CycleGan model, which effectively avoids the color anomalies and blurring that can appear when new expression images are generated. It also introduces region-wise training: the Dlib face recognition library is used to detect key points on the face images, the detected feature points are used to segment the faces in the source and target domains into four regions (left eye, right eye, mouth, and the rest of the face), and the improved CycleGan model is trained on each region separately. Finally, the per-region results are fused by weighting into the final new expression image. Region-wise training further enhances the realism of the synthesized expressions. The experimental data come from the Surrey Audio-Visual Expressed Emotion (SAVEE) database of the University of Surrey, UK, and the experiments are run with Python 3.4 under the TensorFlow framework. The experiments show that the new method needs no driving source data and can generate realistic, natural new expression sequences in real time directly on the source facial animation sequence. For videos with speech, it also keeps the generated facial expression sequence synchronized with the source audio.

Key words: Covariance constraint, CycleGan, Deep learning, Facial expression synthesis, Region segmentation
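
The region segmentation and weighted fusion steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives neither the landmark layout nor the fusion weights, so the 68-point index ranges, the padding, the rectangular boxes, and the helper names `region_boxes` and `fuse_regions` are all assumptions made for illustration. Landmarks are taken as an already-detected `(68, 2)` array of `(x, y)` points, so the sketch needs only NumPy.

```python
import numpy as np

# Assumption: the common 68-point facial landmark layout, where points 36-41
# and 42-47 outline the two eyes and 48-67 the mouth. "left"/"right" follow
# the subject's own left and right. The abstract does not specify this.
REGION_POINTS = {
    "right_eye": range(36, 42),
    "left_eye": range(42, 48),
    "mouth": range(48, 68),
}


def region_boxes(landmarks, image_shape, pad=8):
    """Return {name: (top, bottom, left, right)} boxes clipped to the image."""
    height, width = image_shape[:2]
    boxes = {}
    for name, idx in REGION_POINTS.items():
        pts = landmarks[list(idx)]                       # (n, 2) as (x, y)
        x0, y0 = np.floor(pts.min(axis=0)).astype(int) - pad
        x1, y1 = np.ceil(pts.max(axis=0)).astype(int) + pad
        boxes[name] = (
            int(max(y0, 0)), int(min(y1, height)),
            int(max(x0, 0)), int(min(x1, width)),
        )
    return boxes


def fuse_regions(rest, patches, boxes, weights):
    """Blend each translated patch into the translated rest-of-face image.

    Inside a box:  out = w * patch + (1 - w) * rest
    Outside boxes: out = rest
    The scalar per-region weights are hypothetical placeholders.
    """
    out = rest.astype(np.float64)                        # astype makes a copy
    for name, (top, bottom, left, right) in boxes.items():
        w = weights[name]
        out[top:bottom, left:right] = (
            w * patches[name] + (1.0 - w) * out[top:bottom, left:right]
        )
    return out
```

In this sketch the fourth region (the rest of the face) is represented by the full-size image `rest`, and each eye or mouth patch would be the output of the model trained on that region. A smooth weight mask instead of a scalar weight would be one way to soften the seams at box borders.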

CLC Number: 

  • TP391.41