Computer Science, 2022, Vol. 49, Issue (7): 100-105. doi: 10.11896/jsjkx.210600036

• Computer Graphics & Multimedia •

Photorealistic Style Transfer Guided by Global Information

ZHANG Ying-tao, ZHANG Jie, ZHANG Rui, ZHANG Wen-qiang

  1. School of Computer Science, Fudan University, Shanghai 200011, China
    Shanghai Key Laboratory of Intelligent Information Processing, Fudan University, Shanghai 200011, China
  • Received: 2021-06-03 Revised: 2021-10-17 Online: 2022-07-15 Published: 2022-07-12
  • Corresponding author: ZHANG Rui (zhangrui@fudan.edu.cn)
  • About author: ZHANG Ying-tao (yingtaozhang19@fudan.edu.cn), born in 1997, postgraduate. His main research interests include computer vision, deep learning and machine learning.
    ZHANG Rui, born in 1973, Ph.D, senior engineer. His main research interests include embedded systems, digital signal processing and mobile communications.

Abstract: Unlike artistic style transfer, photorealistic style transfer must keep the output realistic in content while transferring the color style of the style image. Most current photorealistic style transfer methods add pre-processing or post-processing to artistic style transfer methods to preserve the realism of the output image. However, artistic style transfer methods usually cannot make full use of global color information to achieve a more harmonious overall appearance, and the pre-processing and post-processing operations are often tedious and time-consuming. To address these problems, this paper builds a photorealistic style transfer network guided by global information and proposes a color-partition-mean loss (Lcpm) to measure the similarity of the global color distributions of the output and the style image. Adaptive instance normalization (AdaIN) is improved into partition adaptive instance normalization (AdaIN-P), which is better suited to color style transfer for real images. In addition, a cross-channel partition attention module is introduced to make better use of global context information and improve the overall harmony of the output image. Together, these components guide the network decoder to make full use of global information. Experimental results show that, compared with other state-of-the-art methods, the proposed model achieves better photorealistic style transfer while preserving image details.

Key words: Attention mechanism, Convolutional neural network, Encoder and decoder, Feature fusion, Global information, Style transfer

CLC Number:

  • TP391
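
The abstract builds on adaptive instance normalization (AdaIN). As background only, the following is a minimal NumPy sketch of standard AdaIN as it is commonly formulated: each channel of the content feature map is normalized and then rescaled to the per-channel mean and standard deviation of the style feature map. This is not the paper's AdaIN-P; the partitioning scheme and the Lcpm loss are not specified on this page and are not reproduced here. The function name `adain`, the `(C, H, W)` layout, and the `eps` value are assumptions made for illustration.

```python
import numpy as np


def adain(content_feat, style_feat, eps=1e-5):
    """Standard AdaIN sketch (hypothetical helper, not the paper's AdaIN-P).

    content_feat, style_feat: arrays of shape (C, H, W); the spatial
    sizes of the two inputs may differ, the channel count may not.
    """
    # Per-channel statistics over the spatial dimensions.
    c_mean = content_feat.mean(axis=(1, 2), keepdims=True)
    c_std = np.sqrt(content_feat.var(axis=(1, 2), keepdims=True) + eps)
    s_mean = style_feat.mean(axis=(1, 2), keepdims=True)
    s_std = np.sqrt(style_feat.var(axis=(1, 2), keepdims=True) + eps)

    # Normalize the content features, then apply the style statistics.
    normalized = (content_feat - c_mean) / c_std
    return s_std * normalized + s_mean
```

Because the statistics are taken over the whole feature map, every spatial location receives the same per-channel shift and scale. The abstract's partitioned variant presumably refines this, but the details would have to come from the full paper.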