Computer Science ›› 2021, Vol. 48 ›› Issue (1): 182-189. doi: 10.11896/jsjkx.191100092

• Computer Graphics & Multimedia •

Anime Character Portrait Generation Algorithm Based on Improved Generative Adversarial Networks

ZHANG Yang, MA Xiao-hu

  1. School of Computer Science & Technology, Soochow University, Suzhou, Jiangsu 215000, China
  • Received: 2019-11-12 Revised: 2020-03-20 Online: 2021-01-15 Published: 2021-01-15
  • Corresponding author: MA Xiao-hu (xhma@suda.edu.cn)
  • About author: ZHANG Yang, born in 1996, master's candidate, is a student member of the China Computer Federation. His main research interests include generative adversarial networks and image processing.
    MA Xiao-hu, born in 1964, professor, master's supervisor, is a senior member of the China Computer Federation. His main research interests include machine learning and image processing.
  • Supported by:
    Natural Science Foundation of Jiangsu Province, China (BK20141195) and the Priority Academic Program Development of Jiangsu Higher Education Institutions.

Abstract: Existing methods for generating anime character portraits produce results with poor diversity, and they struggle to generate images by class or to control local details as the user intends. To address these problems, we propose an improved model named LMV-ACGAN (Latent label attached Multi scale ACGAN with improved VGG mode), which builds on the auxiliary classifier GAN (ACGAN) and incorporates mutual information theory and multi-scale discrimination. The model consists of a deconvolutional generator with feature integration, a multi-scale feature extractor, and three fully connected networks for real/fake judgment, classification, and latent-parameter restoration. As a semi-supervised generative model, it uses a group of continuous latent parameters in addition to the class label to strengthen the constraint on the generator. The pooling layers of the VGG model used in the convolutional part are replaced by strided convolutions, and the discriminator fuses multi-scale information from the image. Finally, the tail-end structure of the network and the parameter update scheme of each part are modified to reduce, as far as possible, the mutual influence among the classification, real/fake judgment, and latent-parameter restoration branches. Experimental results show that the proposed model effectively solves the mode collapse problem on our dataset and, compared with the original ACGAN, increases the success rate and accuracy of generating images of a specified class. Images that ACGAN generates poorly or classifies incorrectly can be generated correctly by our model. In addition, adjusting the continuous latent parameters enables simple image editing, such as changing the face orientation.

Key words: ACGAN, Generative adversarial networks, Image editing, Image generation, Multi-scale discriminator
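
The abstract notes that the pooling layers of the VGG model are replaced by strided convolutions. The following is a minimal sketch of that substitution in plain Python, not the authors' implementation: it only illustrates that a 2x2 convolution with stride 2 halves the spatial size just as 2x2 pooling does, while its weights can in principle be learned. The function names, the 4x4 input, and the averaging kernel are all assumptions chosen for illustration.

```python
# Illustrative only: single-channel 2-D operations on nested lists, no padding.

def max_pool2d(image, size=2):
    """Non-overlapping max pooling with a size x size window (fixed, no weights)."""
    rows = len(image) // size
    cols = len(image[0]) // size
    return [
        [
            max(image[r * size + i][c * size + j]
                for i in range(size) for j in range(size))
            for c in range(cols)
        ]
        for r in range(rows)
    ]


def strided_conv2d(image, kernel, stride=2):
    """Valid cross-correlation of image with a square kernel, stepping `stride` pixels."""
    k = len(kernel)
    rows = (len(image) - k) // stride + 1
    cols = (len(image[0]) - k) // stride + 1
    return [
        [
            sum(image[r * stride + i][c * stride + j] * kernel[i][j]
                for i in range(k) for j in range(k))
            for c in range(cols)
        ]
        for r in range(rows)
    ]


image = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]

# Hypothetical weights: an averaging kernel. In a trained network these
# values would be learned rather than fixed.
kernel = [[0.25, 0.25], [0.25, 0.25]]

pooled = max_pool2d(image)                 # 4x4 -> 2x2, keeps the max of each window
convolved = strided_conv2d(image, kernel)  # 4x4 -> 2x2, weighted sum of each window

print(pooled)     # [[6, 8], [14, 16]]
print(convolved)  # [[3.5, 5.5], [11.5, 13.5]]
```

Both operations map the 4x4 input to a 2x2 output. The difference is that pooling applies a fixed rule to each window, whereas the strided convolution computes a weighted sum whose weights can be trained along with the rest of the network.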

CLC number:

  • TP391