Computer Science ›› 2023, Vol. 50 ›› Issue (9): 168-175. doi: 10.11896/jsjkx.221000100

• Database & Big Data & Data Science •

  • Corresponding author: JIN Cheng (jc@fudan.edu.cn)
  • About author: (20212010100@fudan.edu.cn)

Image Relighting Network Based on Context-gated Residuals and Multi-scale Attention

WANG Wei, DU Xiangcheng, JIN Cheng   

  1. School of Computer Science,Fudan University,Shanghai 200438,China
  • Received:2022-10-13 Revised:2023-04-06 Online:2023-09-15 Published:2023-09-01
  • About author:WANG Wei,born in 1995,postgraduate,is a member of China Computer Federation.His main research interests include image processing and visual positioning.
    JIN Cheng,born in 1978,Ph.D,professor,Ph.D supervisor,is a member of China Computer Federation.His main research interests include computer vision and multimedia information retrieval.
  • Supported by:
    National Key R & D Program of China (2019YFB2102800).


Abstract: Image relighting is commonly used in image editing and data augmentation tasks. When removing and rendering shadows in complex scenes, existing image relighting methods suffer from inaccurate shadow shape estimation, blurred object textures, and structural deformation. To address these issues, this paper proposes an image relighting network based on context-gated residuals and multi-scale attention. Context-gated residuals capture the long-range dependencies of pixels by aggregating local and global spatial context information, which maintains the consistency of shadow and lighting directions. In addition, the gating mechanism effectively improves the network's ability to recover textures and structures. Multi-scale attention enlarges the receptive field without losing resolution by iteratively extracting and aggregating features at different scales; it activates important features and suppresses the responses of irrelevant ones by applying channel attention and spatial attention in series. A lighting gradient loss is also proposed, which yields visually more satisfactory images by effectively learning the lighting gradients in all directions. Experimental results show that, compared with current state-of-the-art methods, the proposed method improves PSNR and SSIM by 7.47% and 12.37%, respectively.
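As a rough structural illustration of the gating idea described in the abstract, the sketch below shows a context-gated residual in which a global-context vector modulates a per-pixel gate. The shapes, the per-pixel linear maps standing in for learned convolutions, and the function names are all hypothetical, not the paper's actual implementation.

```python
# Illustrative sketch only: learned convolutions are replaced by per-pixel
# linear maps (w_feat, w_gate are hypothetical weight matrices).
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def context_gated_residual(x, w_feat, w_gate):
    """x: (H, W, C) feature map; w_feat, w_gate: (C, C) weight matrices.

    A global-context vector (spatial mean) is mixed into the gate so each
    pixel sees both local and global information; the gated transform is
    then added back as a residual, preserving the input resolution.
    """
    global_ctx = x.mean(axis=(0, 1), keepdims=True)  # (1, 1, C) global context
    feat = x @ w_feat                                # local per-pixel transform
    gate = sigmoid((x + global_ctx) @ w_gate)        # context-aware gate in (0, 1)
    return x + gate * feat                           # gated residual connection

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 4, 8))
y = context_gated_residual(x, rng.standard_normal((8, 8)) * 0.1,
                           rng.standard_normal((8, 8)) * 0.1)
print(y.shape)  # (4, 4, 8): same resolution as the input
```

Because the gate lies in (0, 1), the block can smoothly interpolate between passing the input through unchanged and adding the full transformed feature.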

Key words: Image relighting, Contextual information, Gating mechanism, Lighting gradient, Attention
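The serial channel-then-spatial attention mentioned in the abstract can be sketched as follows; pooling-based weights stand in for the learned layers here, so this is only a structural illustration, not the network's actual attention module.

```python
# Illustrative sketch: channel attention followed by spatial attention,
# with fixed pooling in place of learned squeeze/excitation layers.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def serial_attention(x):
    """x: (H, W, C). Apply channel attention first, then spatial attention."""
    # Channel attention: one weight per channel from global average pooling.
    ch = sigmoid(x.mean(axis=(0, 1)))            # (C,)
    x = x * ch                                   # reweight channels
    # Spatial attention: one weight per pixel from the cross-channel mean.
    sp = sigmoid(x.mean(axis=2, keepdims=True))  # (H, W, 1)
    return x * sp                                # reweight spatial locations

rng = np.random.default_rng(1)
x = rng.standard_normal((4, 4, 8))
y = serial_attention(x)
print(y.shape)  # (4, 4, 8)
```

Since both attention maps lie in (0, 1), every activation is attenuated rather than amplified, which matches the idea of suppressing irrelevant responses while keeping important features comparatively strong.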

CLC number: TP391
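One plausible reading of the lighting gradient loss is an L1 penalty on finite-difference gradients taken along several directions; the sketch below assumes four directions (horizontal, vertical, and both diagonals) and single-channel images, which may differ from the paper's exact formulation.

```python
# Illustrative sketch of a directional gradient loss; the choice of four
# directions and plain L1 averaging is an assumption, not the paper's spec.
import numpy as np

def directional_gradients(img):
    """img: (H, W) luminance map. Finite differences along four directions."""
    gx = img[:, 1:] - img[:, :-1]       # horizontal
    gy = img[1:, :] - img[:-1, :]       # vertical
    gd1 = img[1:, 1:] - img[:-1, :-1]   # main diagonal
    gd2 = img[1:, :-1] - img[:-1, 1:]   # anti-diagonal
    return gx, gy, gd1, gd2

def lighting_gradient_loss(pred, target):
    """Sum over directions of the mean absolute gradient difference."""
    return sum(np.abs(gp - gt).mean()
               for gp, gt in zip(directional_gradients(pred),
                                 directional_gradients(target)))

pred = np.ones((3, 3))
target = np.ones((3, 3))
print(lighting_gradient_loss(pred, target))  # 0.0 for identical images
```

A loss of this shape is invariant to a constant brightness offset (adding a constant leaves all gradients unchanged), so it penalizes disagreement in how illumination varies across the image rather than in absolute intensity.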