Computer Science ›› 2023, Vol. 50 ›› Issue (2): 221-230. doi: 10.11896/jsjkx.220800166

• Computer Graphics & Multimedia •


Lightweight Face Generation Method Based on TransEditor and Its Application Specification

LIANG Weiliang1, LI Yue2, WANG Pengfei3

  1. 1 Blockchain Research Institute, Renmin University of China, Beijing 100872, China
    2 School of Economics, Shenzhen Polytechnics, Shenzhen, Guangdong 518055, China
    3 Big Data Center, Ministry of Emergency Management of the People's Republic of China, Beijing 100013, China
  • Received: 2022-08-16 Revised: 2022-10-01 Online: 2023-02-15 Published: 2023-02-22
  • Corresponding author: LI Yue (LiYue9303@szpt.edu.cn)
  • About the author: (751101457@qq.com)
  • Supported by:
    Youth Foundation of Social Science and Humanity, China Ministry of Education (21YJC820023) and China Postdoctoral Science Foundation (2022M713439)

Abstract: Face generation combines the style of a face with the pose of the head to synthesize fake face images, and is often used for vision tasks such as gender conversion and pose modification. GAN-based face generation methods have greatly improved the quality and editability of generated faces. However, these methods have complex network structures and large computing resource requirements, which makes them difficult to apply directly in practical scenarios. To achieve efficient face generation, this paper proposes a lightweight face generation method based on TransEditor and discusses the corresponding application specifications. At the technical level, first, taking the TransEditor face editing network model as a basis and referring to the generator structure of lightweight network models such as StyleGAN2, we design a lightweight face generation network model. Second, we analyze the loss function of the network model in terms of generation loss, adversarial loss, reconstruction loss, and so on, and propose using the PReLU activation function instead of the Softplus activation function to improve the generation quality of the generator. Finally, extensive experiments show that the LPIPS of the proposed lightweight face generation method based on TransEditor decreases by only 0.0042, while the training time and parameter count of the model are greatly reduced and the running efficiency of the face generation model is improved. At the level of application specifications, existing regulatory measures need to be improved and the use of the proposed face generation method standardized, so that technological progress can better serve social development.

Key words: Face generation, Generative adversarial network, Transformer network, Lightweight, Application specification
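
To make the activation swap mentioned in the abstract concrete, the following minimal sketch compares the two functions in plain Python. It is not the authors' implementation: the function names `softplus` and `prelu` are chosen here for illustration, and the fixed slope of 0.25 is an assumed value, since in a trained network the PReLU slope is a learned parameter.

```python
import math


def softplus(x: float) -> float:
    """Softplus: log(1 + exp(x)), written in a numerically stable form."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def prelu(x: float, slope: float = 0.25) -> float:
    """PReLU: identity for x >= 0, slope * x otherwise.

    The slope is learned during training in a real network;
    0.25 is only an assumed value for this illustration.
    """
    return x if x >= 0.0 else slope * x


if __name__ == "__main__":
    # Print both activations side by side for a few sample inputs.
    for x in (-4.0, -1.0, 0.0, 1.0, 4.0):
        print(f"x={x:+.1f}  softplus={softplus(x):.4f}  prelu={prelu(x):+.4f}")
```

Softplus is smooth and strictly positive and needs an exponential and a logarithm per element, whereas PReLU is piecewise linear and lets negative inputs through scaled by the slope. The sketch only shows this difference in shape and cost; it does not reproduce the paper's measurements.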


  • CLC Number: TP301.6