Computer Science (计算机科学) ›› 2021, Vol. 48 ›› Issue (6): 125-130. doi: 10.11896/jsjkx.200400107
徐泽, 帅仁俊, 刘开凯, 马力, 吴梦麟
XU Ze, SHUAI Ren-jun, LIU Kai-kai, MA Li, WU Meng-lin
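Abstract: In recent years, the challenging task of synthesizing images from text descriptions with generative adversarial networks (GANs) has produced encouraging results. Although existing methods can generate images with roughly the right shape and color, they often produce globally distorted images with unnatural local details. There are two reasons. First, convolutional neural networks are inefficient at capturing the high-level semantic information needed for pixel-level image synthesis. Second, the generator-discriminator pair at the coarse stage lacks detailed information and so produces a defective result, which is then fed forward as the input that drives generation of the final result. This paper therefore proposes a generative adversarial network based on feature fusion. The network introduces multi-scale feature fusion by embedding a feature pyramid structure built from residual blocks, and it generates the final fine-grained image directly by adaptively fusing these features, so that a single discriminator is enough to generate realistic 256px×256px images. The proposed method is validated on the Oxford-102 flower dataset and the Caltech-UCSD Birds (CUB) dataset, with Inception Score and FID used to assess the quality of the generated images. The results show that the generated images are of clearly higher quality than those produced by several earlier classic methods.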
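To make the evaluation metric concrete, here is a minimal sketch of how an Inception Score can be computed from per-image class probabilities, using the standard definition exp(E_x KL(p(y|x) || p(y))). It is not the authors' evaluation code: the function name `inception_score`, the `probs` argument, and the `eps` smoothing constant are illustrative assumptions, and the step that obtains the probabilities from a pretrained Inception network (as well as the usual averaging over several splits) is omitted.

```python
import math


def inception_score(probs, eps=1e-12):
    """Inception Score from per-image class probabilities p(y|x).

    probs: list of rows, each a probability distribution over classes
           (hypothetical input; in practice produced by an Inception net).
    Returns exp(mean over images of KL(p(y|x) || p(y))).
    """
    if not probs:
        raise ValueError("need at least one prediction")
    n, k = len(probs), len(probs[0])
    # Marginal class distribution p(y): the average of the conditional rows.
    marginal = [sum(row[j] for row in probs) / n for j in range(k)]
    total_kl = 0.0
    for row in probs:
        # KL(p(y|x) || p(y)); entries with p(y|x) = 0 contribute nothing.
        total_kl += sum(
            p * (math.log(p + eps) - math.log(m + eps))
            for p, m in zip(row, marginal)
            if p > 0.0
        )
    return math.exp(total_kl / n)
```

Under this definition, predictions that are identical for every image give a score of about 1, while confident one-hot predictions spread evenly over k classes give a score of about k. A higher score therefore reflects images that are both individually recognizable and diverse as a set.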