Computer Science, 2021, Vol. 48, Issue (1): 241-246. doi: 10.11896/jsjkx.200700187

• Artificial Intelligence •

Conditional Generative Adversarial Network Based on Self-attention Mechanism

YU Wen-jia, DING Shi-fei

  1. School of Computer Science and Technology, China University of Mining and Technology, Xuzhou, Jiangsu 221116, China
  • Received: 2020-07-29 Revised: 2020-09-22 Online: 2021-01-15 Published: 2021-01-15
  • Corresponding author: DING Shi-fei (dingsf@cumt.edu.cn)
  • About author: ts18170032a31@cumt.edu.cn
    YU Wen-jia, born in 1994, postgraduate, is a student member of China Computer Federation. His main research interests include deep learning and computer vision.
    DING Shi-fei, born in 1963, Ph.D., professor, Ph.D. supervisor, is a director of China Computer Federation. His main research interests include artificial intelligence, machine learning, pattern recognition and data mining.
  • Supported by:
    National Natural Science Foundation of China (61672522, 61976216).

Abstract: In recent years, generative adversarial networks (GANs) have been applied in a growing number of areas of deep learning. Conditional generative adversarial networks (cGAN) were the first to bring supervised learning into the otherwise unsupervised GAN framework, which allows a GAN to generate labeled data. A traditional GAN generates images through stacked convolution operations, and it relies on those convolutions to model the dependencies among different image regions. cGAN modifies only the objective function of GAN and leaves the network structure unchanged, so it inherits the same weakness: features that lie far apart in the generated image are only weakly correlated, which leaves the details of the generated image unclear. To address this problem, this paper introduces the self-attention mechanism into cGAN and proposes a new model, SA-cGAN. The model relates distant features in the image to one another in order to generate consistent objects or scenes, which improves the network's ability to generate details. SA-cGAN is evaluated on the CelebA and MNIST handwritten digit datasets and compared with several commonly used generative models, including DCGAN and cGAN. The results indicate that the proposed model offers some improvement over these models in image generation.
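The idea of relating distant features can be illustrated with a small sketch. The code below is not the authors' SA-cGAN implementation, whose architecture is not given on this page. It is a minimal NumPy version of a SAGAN-style self-attention layer over a single feature map, under stated assumptions: the function name `self_attention_2d`, the projection matrices `w_q`, `w_k`, `w_v`, the residual weight `gamma`, and all tensor sizes are hypothetical choices made for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def self_attention_2d(x, w_q, w_k, w_v, gamma):
    """Apply self-attention to one feature map (hypothetical helper).

    x:        feature map of shape (C, H, W)
    w_q, w_k: query/key projections of shape (C_qk, C)
    w_v:      value projection of shape (C, C)
    gamma:    scalar weight on the attention branch (learned in practice)
    """
    c, h, w = x.shape
    flat = x.reshape(c, h * w)        # (C, N), with N = H * W positions
    q = w_q @ flat                    # (C_qk, N)
    k = w_k @ flat                    # (C_qk, N)
    v = w_v @ flat                    # (C, N)

    # scores[i, j] measures how strongly position i attends to position j,
    # no matter how far apart the two positions are in the image.
    scores = q.T @ k                  # (N, N)
    attn = softmax(scores, axis=-1)   # each row sums to 1

    # Each output position is a weighted sum of the values at all positions.
    out = v @ attn.T                  # (C, N)

    # Residual connection: gamma = 0 leaves the input features unchanged.
    return gamma * out.reshape(c, h, w) + x, attn


# Assumed toy sizes, chosen only for the demonstration.
rng = np.random.default_rng(0)
C, H, W = 8, 4, 4
x = rng.standard_normal((C, H, W))
w_q = 0.1 * rng.standard_normal((C // 4, C))
w_k = 0.1 * rng.standard_normal((C // 4, C))
w_v = 0.1 * rng.standard_normal((C, C))

y, attn = self_attention_2d(x, w_q, w_k, w_v, gamma=0.5)
print(y.shape, attn.shape)  # (8, 4, 4) (16, 16)
```

Unlike a convolution, whose output at a position depends only on a local neighborhood, every row of `attn` spans all `H * W` positions, so a single layer can link features at opposite corners of the image. In a trained network the projections and `gamma` would be learned parameters rather than random values.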

Key words: cGAN, Deep learning, Generative adversarial network, SA-cGAN, Self-attention

CLC Number:

  • TP391