Research on Application of Improved GAN Network in Generating Short Video

Computer Science, 2021, Vol. 48, Issue (11A): 625-629. doi: 10.11896/jsjkx.210300114

YU Xiao-ming (于晓明), HUANG Hua (黄铧)

College of Electronic Information and Artificial Intelligence, Shaanxi University of Science and Technology, Xi'an 710021, China

Online: 2021-11-10 Published: 2021-11-12
Corresponding author: HUANG Hua (1792156660@qq.com)
About author: 494636031@qq.com
YU Xiao-ming, born in 1965, associate professor. Her main research interests include intelligent information processing, graphics and image processing.
HUANG Hua, born in 1995, postgraduate. His main research interests include graphics and image processing.
Supported by:
Science and Technology Department of Shaanxi Province, China (2014KRM80) and Project of Xianyang Science and Technology Bureau, China (2013K15-07).

Abstract

Abstract: When generative adversarial networks (GANs) are used to generate dynamic images, some objects often change color between consecutive frames and the generated details look unnatural. To address these shortcomings in video generation, this work improves both the generator and the discriminator of the GAN. In the generator, the foreground and background of the video are modeled separately, and the Multi Spatially-Adaptive Normalization (M-SPADE) algorithm is used. For the discriminator, the dual video discriminator of DVD-GAN is adopted. The model is trained on the Kinetics-600 dataset, and the results are compared with F-Vid2Vid, WC-Vid2Vid, and other generation methods. The experiments show that the improved GAN handles inter-frame color inconsistency and fine details well, and that the generated images are comparatively clearer.
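The three ingredients named in the abstract can be illustrated with a small NumPy sketch. This page does not give the paper's architecture, so everything here is an assumption made for illustration: the function names are hypothetical, the scale and shift maps `gamma` and `beta` are passed in directly rather than predicted by convolution layers, "multi" is read as several modulations applied in sequence, and the discriminator inputs follow the general DVD-GAN idea of scoring a few full-resolution frames alongside a spatially downsampled clip.

```python
import numpy as np


def spade_modulate(x, gamma, beta, eps=1e-5):
    """Normalize each channel of x, then apply a per-pixel scale and shift.

    x, gamma, beta have shape (C, H, W). In SPADE, gamma and beta are predicted
    from a conditioning map by small conv layers; passing them in directly is
    a simplification made for this sketch. Statistics are taken per channel
    over a single sample.
    """
    mean = x.mean(axis=(1, 2), keepdims=True)
    std = x.std(axis=(1, 2), keepdims=True)
    return gamma * (x - mean) / (std + eps) + beta


def multi_spade(x, conditions):
    """Hypothetical M-SPADE stand-in: one modulation per (gamma, beta) pair."""
    for gamma, beta in conditions:
        x = spade_modulate(x, gamma, beta)
    return x


def compose_frame(foreground, background, mask):
    """Blend separately generated foreground and background; mask is in [0, 1]."""
    return mask * foreground + (1.0 - mask) * background


def dual_discriminator_inputs(video, k, rng):
    """Prepare inputs for the two discriminators from a (T, C, H, W) video.

    Spatial branch: k randomly chosen full-resolution frames.
    Temporal branch: the whole clip after 2x2 average pooling (H, W must be even).
    """
    t, c, h, w = video.shape
    idx = np.sort(rng.choice(t, size=k, replace=False))
    frames = video[idx]
    pooled = video.reshape(t, c, h // 2, 2, w // 2, 2).mean(axis=(3, 5))
    return frames, pooled
```

In this sketch a generated frame would be obtained by calling `compose_frame` on the outputs of two generator branches, and `dual_discriminator_inputs(video, k, np.random.default_rng(0))` would supply one discriminator with individual frames and the other with the low-resolution clip.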

Key words: Dual video discriminator, Foreground-background image modeling, Generative adversarial networks, Multi spatially-adaptive normalization, Short video synthesis

CLC Number: TP391

YU Xiao-ming, HUANG Hua. Research on Application of Improved GAN Network in Generating Short Video[J]. Computer Science, 2021, 48(11A): 625-629. https://doi.org/10.11896/jsjkx.210300114

