Polarized Self-attention Constrains Color Overflow in Automatic Coloring of Image

doi:10.11896/jsjkx.220100149

Computer Science, 2023, Vol. 50, Issue (3): 208-215. doi: 10.11896/jsjkx.220100149

• Computer Graphics & Multimedia •

LIU Hang¹, PU Yuanyuan¹,², LYU Dahua¹, ZHAO Zhengpeng¹, XU Dan¹, QIAN Wenhua¹

1 School of Information Science and Engineering, Yunnan University, Kunming 650504, China
2 University Key Laboratory of Internet of Things Technology and Application, Yunnan Province, Kunming 650504, China

Received: 2022-01-16 Revised: 2022-09-08 Online: 2023-03-15 Published: 2023-03-15
Corresponding author: PU Yuanyuan (yuanyuanpu@ynu.edu.cn)
About author (lhkaka824@163.com): LIU Hang, born in 1995, postgraduate. His main research interests include deep learning and image colorization.
PU Yuanyuan, born in 1972, professor, Ph.D. supervisor, is a member of China Computer Federation. Her main research interests include digital image processing, non-photorealistic rendering, and scientific understanding of visual arts.
Supported by:
National Natural Science Foundation of China (62162068, 61271361, 61761046, 62061049), Yunnan Science and Technology Department Project (2018FB100) and Key Program of the Applied Basic Research Programs of Yunnan (202001BB050043, 2019FA044).

Abstract: Automatic colorization converts grayscale images into plausibly colored natural images and can restore color to old photographs and black-and-white films and videos, so it has attracted wide attention in computer vision and graphics. However, assigning colors to a grayscale image is highly challenging and is prone to color overflow. To address this problem, an automatic image colorization method that uses polarized self-attention to constrain color overflow is proposed. First, instances in the foreground are separated from the background, which reduces the influence of the background on foreground colorization and thus mitigates color overflow between foreground and background. Second, a polarized self-attention module splits the features into a color-channel part and a spatial-location part, making colorization more accurate and specific and thus reducing color overflow within the global image and within instance objects. Finally, a fusion module fuses the global features and the instance features with different weights to produce the final colorization. Experimental results show that, compared with algorithms such as ChromaGAN and MemoGAN, the proposed method improves the main metrics FID and LPIPS by 9.7% and 10.9% respectively, and achieves the best SSIM and PSNR.

Key words: Image coloring, Deep learning, Target detection, Self attention, Color overflow
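
The fusion step described in the abstract (global features and instance features combined through different weights) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `fuse_features`, the box convention `(top, left, h, w)`, and the choice of a per-pixel softmax over two weight maps are all assumptions made for illustration, and the instance feature map is assumed to be already resized to its bounding box.

```python
import numpy as np

def fuse_features(global_feat, inst_feat, box, w_global, w_inst):
    """Blend an instance feature map into the global feature map inside `box`.

    All names and the softmax weighting are hypothetical (illustration only).
    global_feat: (C, H, W) full-image features
    inst_feat:   (C, h, w) instance features, already resized to the box
    box:         (top, left, h, w) location of the instance in the global map
    w_global:    (H, W) unnormalized weight map for the global branch
    w_inst:      (h, w) unnormalized weight map for the instance branch
    """
    top, left, h, w = box
    fused = global_feat.copy()  # pixels outside the box keep global features

    # Per-pixel softmax over the two branches, computed inside the box only.
    wg = w_global[top:top + h, left:left + w]
    m = np.maximum(wg, w_inst)  # subtract the max for numerical stability
    eg, ei = np.exp(wg - m), np.exp(w_inst - m)
    a_global, a_inst = eg / (eg + ei), ei / (eg + ei)

    # (h, w) weights broadcast over the channel axis of the (C, h, w) features.
    fused[:, top:top + h, left:left + w] = (
        a_global * global_feat[:, top:top + h, left:left + w]
        + a_inst * inst_feat
    )
    return fused
```

With equal weight maps the two branches are averaged inside the box, while a larger instance weight pulls the fused features toward the instance branch. Features outside the box are left untouched.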

CLC Number: TP391

LIU Hang, PU Yuanyuan, LYU Dahua, ZHAO Zhengpeng, XU Dan, QIAN Wenhua. Polarized Self-attention Constrains Color Overflow in Automatic Coloring of Image[J]. Computer Science, 2023, 50(3): 208-215. https://doi.org/10.11896/jsjkx.220100149
