Computer Science, 2021, Vol. 48, Issue (1): 241-246. doi: 10.11896/jsjkx.200700187

• Artificial Intelligence •

Conditional Generative Adversarial Network Based on Self-attention Mechanism

YU Wen-jia, DING Shi-fei

  1. School of Computer Science and Technology, China University of Mining and Technology, Xuzhou, Jiangsu 221116, China
  • Received: 2020-07-29 Revised: 2020-09-22 Online: 2021-01-15 Published: 2021-01-15
  • Corresponding author: DING Shi-fei (dingsf@cumt.edu.cn)
  • About author: YU Wen-jia, born in 1994, postgraduate, is a student member of China Computer Federation. His main research interests include deep learning and computer vision.
    DING Shi-fei, born in 1963, Ph.D, professor, Ph.D supervisor, is a director of China Computer Federation. His main research interests include artificial intelligence, machine learning, pattern recognition and data mining.
    E-mail: ts18170032a31@cumt.edu.cn
  • Supported by:
    National Natural Science Foundation of China (61672522, 61976216).

Abstract: In recent years, generative adversarial networks (GANs) have appeared in more and more areas of deep learning. The conditional generative adversarial network (cGAN) was the first to introduce supervised learning into the otherwise unsupervised GAN, which allows a GAN to generate labeled data. A traditional GAN generates images through repeated convolution operations that model the dependencies among different regions. cGAN, however, only modifies the objective function of GAN and leaves the network structure unchanged. As a result, features that are far apart in a cGAN-generated image remain only weakly correlated, which leaves the details of the generated image unclear. To address this problem, this paper introduces the self-attention mechanism into cGAN and proposes a new model named SA-cGAN. The model generates consistent objects or scenes by relating features that are far apart in the image, which improves the network's ability to generate details. SA-cGAN is evaluated on the CelebA dataset and the MNIST handwritten digit dataset and compared with several commonly used generative models such as DCGAN and cGAN. The results show that the proposed model makes some progress over these models in image generation.

Key words: Deep learning, Generative adversarial network, cGAN, Self-attention, SA-cGAN

CLC Number: 

  • TP391
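
The idea summarized in the abstract, letting every position of a feature map attend to every other position so that distant features can be related, can be illustrated with a minimal sketch. This is not the authors' SA-cGAN implementation: it assumes the common self-attention formulation in which each position's output is a softmax-weighted sum over all positions, added back to the input through a residual scaled by `gamma`. The one-hot label concatenation in `add_condition`, the identity projection matrices, and all function names are hypothetical choices made only for illustration.

```python
import math


def identity(n):
    """Return an n x n identity matrix (stand-in for learned projections)."""
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def add_condition(features, label, num_classes):
    """Assumption: condition by appending a one-hot label to every position."""
    one_hot = [1.0 if i == label else 0.0 for i in range(num_classes)]
    return [x + one_hot for x in features]


def self_attention(features, Wq, Wk, Wv, gamma):
    """Self-attention over N flattened feature-map positions.

    features: list of N vectors, one per spatial position.
    Returns (outputs, attention_maps); row j of attention_maps holds the
    weights that position j assigns to every position, near or far.
    """
    queries = [matvec(Wq, x) for x in features]
    keys = [matvec(Wk, x) for x in features]
    values = [matvec(Wv, x) for x in features]
    outputs, attention_maps = [], []
    for q, x in zip(queries, features):
        scores = [sum(a * b for a, b in zip(q, k)) for k in keys]
        weights = softmax(scores)
        attended = [
            sum(w * v[c] for w, v in zip(weights, values))
            for c in range(len(x))
        ]
        # Residual connection: gamma controls how much attention is mixed in.
        outputs.append([gamma * a + xi for a, xi in zip(attended, x)])
        attention_maps.append(weights)
    return outputs, attention_maps


if __name__ == "__main__":
    # A 2x2 feature map with 2 channels, flattened to 4 positions.
    feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]]
    cond = add_condition(feats, label=1, num_classes=2)
    proj = identity(len(cond[0]))
    out, attn = self_attention(cond, proj, proj, proj, gamma=1.0)
    for row in attn:
        print([round(w, 3) for w in row])
```

With `gamma` set to 0 the layer passes its input through unchanged, so the attention contribution can be introduced gradually as `gamma` grows during training; in a real model the projections and `gamma` would be learned rather than fixed.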