Computer Science ›› 2022, Vol. 49 ›› Issue (2): 51-61.doi: 10.11896/jsjkx.210400108

• Computer Vision: Theory and Application •

Research Progress of Face Editing Based on Deep Generative Model

TANG Yu-xiao, WANG Bin-jun   

  1. College of Information Network Security, People's Public Security University of China, Beijing 100038, China
  • Received: 2021-04-12  Revised: 2021-07-06  Online: 2022-02-15  Published: 2022-02-23
  • About author: TANG Yu-xiao, born in 1997, master. Her main research interests include artificial intelligence and face editing.
    WANG Bin-jun, born in 1962, Ph.D, professor, Ph.D supervisor, is a member of China Computer Federation. His main research interests include cyber security and artificial intelligence.
  • Supported by:
    Key Program of National Social Science Foundation (20AZD114), CCF-NSFOCUS "Kunpeng" Scientific Research Fund (CCF-NSFOCUS 2020011) and Open Research Fund of the Public Security Behavioral Science Laboratory, People's Public Security University of China (2020sys08).

Abstract: Face editing is widely used in public security pursuits, face beautification, and other fields. Traditional statistical methods and prototype-based methods have long been the main means of face editing, but they suffer from problems such as difficult operation and high computational cost. In recent years, the development of deep learning, and in particular the emergence of generative networks, has provided a brand-new approach to face editing. Face editing technology based on deep generative models offers fast inference and strong model generalization ability. To summarize and review recent theories and research on applying deep generative models to face editing, this paper first introduces the network frameworks and principles adopted by face editing technology based on deep generative models. It then describes the methods used in detail, grouping them into three categories: image translation, introduction of conditional information within the network, and manipulation of the latent space. Finally, it summarizes the challenges facing this technology, namely identity consistency, attribute decoupling, and attribute-editing accuracy, and points out the issues that urgently need to be resolved in the future.
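Of the three method families above, manipulation of the latent space is the easiest to illustrate in isolation: once a linear direction for an attribute has been found in a pretrained GAN's latent space, editing reduces to moving a latent code along that direction. The sketch below is a minimal, hedged illustration of this idea only; `generator` and the "smile" direction are placeholders for a real pretrained model and a direction actually learned from labeled latents (e.g., as the normal of a linear classifier's hyperplane), not artifacts from this survey.

```python
# Minimal sketch of latent-space face editing: shift a latent code
# along a unit attribute direction. `n_smile` here is random filler
# standing in for a learned attribute direction; the generator call
# at the end is commented out because it assumes a pretrained model.
import numpy as np

def edit_latent(w: np.ndarray, direction: np.ndarray, alpha: float) -> np.ndarray:
    """Move latent code `w` by `alpha` along the unit-normalized
    attribute `direction`; the sign of `alpha` flips the attribute."""
    n = direction / np.linalg.norm(direction)  # unit-normalize the direction
    return w + alpha * n

# Toy usage: a 512-dim latent and a stand-in "smile" direction.
rng = np.random.default_rng(0)
w = rng.standard_normal(512)
n_smile = rng.standard_normal(512)
w_edited = edit_latent(w, n_smile, alpha=3.0)
# img = generator.synthesis(w_edited)  # decode with a pretrained GAN
```

The same mechanics underlie both supervised direction-finding methods and unsupervised ones; they differ only in how `direction` is obtained, while the edit itself stays a single vector addition.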

Key words: Deep learning, Face editing, GAN, Latent space, VAE

CLC Number: TP309