Computer Science, 2021, Vol. 48, Issue (6): 125-130. doi: 10.11896/jsjkx.200400107

• Computer Graphics & Multimedia •

Generation of Realistic Image from Text Based on Feature Fusion

XU Ze, SHUAI Ren-jun, LIU Kai-kai, MA Li, WU Meng-lin   

  1. College of Computer Science and Technology,Nanjing Tech University,Nanjing 211816,China
  • Received:2020-04-23 Revised:2020-09-07 Online:2021-06-15 Published:2021-06-03
  • About author:XU Ze,born in 1994,postgraduate.His main research interests include image processing and machine learning.
    SHUAI Ren-jun,born in 1962,postgraduate,associate professor.His main research interests include artificial intelligence and intelligent medical care.
  • Supported by:
    National Natural Science Foundation of China(61701222).

Abstract: Synthesizing images from text descriptions with generative adversarial networks (GANs) is a challenging task that has recently shown encouraging results. Existing methods can produce images with roughly correct shapes and colors, but the images often contain unnatural local details and distortions. There are two causes. First, convolutional neural networks are inefficient at capturing high-level semantic information for pixel-level image synthesis. Second, the generator-discriminator pair at the coarse stage produces flawed results that lack detail, and these results then serve as input to the final stage. We propose a generative adversarial network based on feature fusion. It introduces multi-scale feature fusion by embedding a residual block feature pyramid structure, generates the final fine image directly through adaptive fusion of these features, and produces a realistic 256px×256px image with only one discriminator. The proposed method is evaluated on the Oxford-102 flower dataset and the Caltech-UCSD Birds (CUB) dataset, and the quality of the generated images is measured with Inception Score and FID. The results show that the proposed method generates higher-quality images than some classical methods.
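The abstract does not say how the adaptive fusion of multi-scale features is computed. The sketch below only illustrates the general idea and is not the authors' architecture. It assumes that each pyramid level is a single-channel square map, that coarser levels are brought to the finest resolution by nearest-neighbour upsampling, and that the levels are combined with softmax-normalised scalar weights (which a real network would learn). The names `upsample_nearest`, `softmax`, and `fuse` are hypothetical.

```python
import math


def upsample_nearest(fmap, size):
    """Nearest-neighbour upsampling of a square 2-D map to size x size."""
    src = len(fmap)
    return [[fmap[r * src // size][c * src // size] for c in range(size)]
            for r in range(size)]


def softmax(logits):
    """Turn arbitrary scores into positive weights that sum to 1."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse(pyramid, logits):
    """Weighted sum of pyramid levels after upsampling to the finest size.

    Assumption: one scalar weight per level; this is an illustrative choice,
    not something stated in the abstract.
    """
    size = max(len(fmap) for fmap in pyramid)
    weights = softmax(logits)
    fused = [[0.0] * size for _ in range(size)]
    for weight, fmap in zip(weights, pyramid):
        up = upsample_nearest(fmap, size)
        for r in range(size):
            for c in range(size):
                fused[r][c] += weight * up[r][c]
    return fused


# Toy pyramid with 1x1, 2x2 and 4x4 maps (values are made up).
pyramid = [
    [[1.0]],
    [[0.0, 2.0], [2.0, 0.0]],
    [[float(r == c) for c in range(4)] for r in range(4)],
]
fused = fuse(pyramid, logits=[0.0, 0.0, 0.0])  # equal logits give equal weights
```

With equal logits every level contributes one third of each fused value. Raising one logit shifts the output toward that scale, which is the sense in which such a fusion is "adaptive".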

Key words: Discriminator, Feature fusion, Generative adversarial network, Residual block feature pyramid

CLC Number: 

  • TP391