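基于运动估计与时空结合的多帧融合去雨网络 (Multi-frame Fusion Rain Removal Network Based on Motion Estimation and Spatio-temporal Combination)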

doi:10.11896/jsjkx.210100104

Computer Science ›› 2021, Vol. 48 ›› Issue (5): 170-176. doi: 10.11896/jsjkx.210100104

• Computer Graphics & Multimedia •

Motion-estimation Based Space-temporal Feature Aggregation Network for Multi-frames Rain Removal

MENG Xiang-yu (孟祥玉)¹, XUE Xin-wei (薛昕惟)¹,², LI Wen-lin (李汶霖)¹, WANG Yi (王祎)¹,²

1 DUT-RU International School of Information Science & Engineering, Dalian University of Technology, Dalian, Liaoning 116621, China
2 Key Laboratory for Ubiquitous Network and Service Software of Liaoning Province, Dalian, Liaoning 116621, China

Received: 2021-01-13 Revised: 2021-03-30 Online: 2021-05-15 Published: 2021-05-09
Corresponding author: XUE Xin-wei (xuexinwei@dlut.edu.cn)
About author: MENG Xiang-yu, born in 1998, postgraduate. His main research interests include computer vision and image processing. (lnmengxiangyu@mail.dlut.edu.cn)
XUE Xin-wei, born in 1984, Ph.D, lecturer, graduate supervisor, is a member of China Computer Federation. Her main research interests include machine learning and computer vision.
Supported by:
National Natural Science Foundation of China (61806036, 61976037) and Fundamental Research Funds for the Central Universities (DUT19TD19).

Abstract

Abstract: Rainy weather degrades visual quality and thereby impairs vision tasks such as object recognition and tracking. To reduce the influence of rain and effectively recover background details in videos that contain motion, many video rain removal methods have been proposed in recent years. Among them, methods based on convolutional neural networks are the most widely used, and most of them enhance single frames before fusing multiple frames to remove rain. However, direct single-frame enhancement leaves pixels that move between adjacent frames unaligned in the temporal dimension and does not allow effective end-to-end training, so a large amount of detail is lost and the final result is unsatisfactory. To address these problems, this paper proposes a multi-frame fusion rain removal network based on motion estimation and the combination of spatial and temporal features, ME-Derain for short. First, an optical flow estimation algorithm aligns the adjacent frames to the current frame so that temporal information is used effectively. Next, an encoder-decoder structure with residual connections is combined with a time-related attention enhancement mechanism to form a multi-frame fusion network that fuses the information from multiple frames. Finally, a spatially related multi-scale enhancement module further improves the rain removal and produces the final derained video. Extensive experiments on several datasets show that the proposed method outperforms most current video rain removal methods and achieves better rain removal results.

Key words: Convolutional neural network, Optical flow, Video enhancement, Video rain removal
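The first two stages named in the abstract (aligning adjacent frames to the current frame, then fusing them with time-related weights) can be illustrated with a toy sketch. This is not the ME-Derain implementation. It assumes that the flow field is already given rather than estimated by a network, that frames are small grayscale lists, and that the learned attention is replaced by a hand-crafted softmax weighting around the per-pixel temporal median. The function names and the `sharpness` parameter are hypothetical. The encoder-decoder and the multi-scale spatial enhancement module are not covered.

```python
import math
import statistics


def warp_to_reference(frame, flow):
    """Backward-warp `frame` onto the reference (current) frame grid.

    frame: H x W list of floats (grayscale).
    flow:  H x W list of (dx, dy). Pixel (x, y) of the reference frame is
           assumed to correspond to position (x + dx, y + dy) in `frame`.
    """
    h, w = len(frame), len(frame[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            dx, dy = flow[y][x]
            # Clamp the sampling position to the image border.
            sx = min(max(x + dx, 0.0), w - 1.0)
            sy = min(max(y + dy, 0.0), h - 1.0)
            x0, y0 = int(math.floor(sx)), int(math.floor(sy))
            x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
            ax, ay = sx - x0, sy - y0
            # Bilinear interpolation of the four neighbouring pixels.
            top = (1 - ax) * frame[y0][x0] + ax * frame[y0][x1]
            bottom = (1 - ax) * frame[y1][x0] + ax * frame[y1][x1]
            out[y][x] = (1 - ay) * top + ay * bottom
    return out


def fuse_with_temporal_weights(reference, aligned_frames, sharpness=10.0):
    """Per-pixel softmax-weighted average over reference and aligned frames.

    Hand-crafted stand-in for a learned temporal attention: a frame whose
    value is far from the temporal median at a pixel (for example a bright
    streak visible in one frame only) receives a small weight.
    """
    h, w = len(reference), len(reference[0])
    frames = [reference] + list(aligned_frames)
    fused = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            values = [f[y][x] for f in frames]
            centre = statistics.median(values)
            scores = [-sharpness * abs(v - centre) for v in values]
            top_score = max(scores)  # subtract the max for numerical stability
            weights = [math.exp(s - top_score) for s in scores]
            total = sum(weights)
            fused[y][x] = sum(wt * v for wt, v in zip(weights, values)) / total
    return fused


if __name__ == "__main__":
    h, w = 3, 5
    # Current frame: horizontal ramp with one bright "streak" pixel.
    current = [[float(x) for x in range(w)] for _ in range(h)]
    current[1][2] = 9.0
    # Neighbouring frames: the same ramp shifted by one pixel each way.
    previous = [[float(x - 1) for x in range(w)] for _ in range(h)]
    following = [[float(x + 1) for x in range(w)] for _ in range(h)]
    aligned_prev = warp_to_reference(previous, [[(1.0, 0.0)] * w for _ in range(h)])
    aligned_next = warp_to_reference(following, [[(-1.0, 0.0)] * w for _ in range(h)])
    fused = fuse_with_temporal_weights(current, [aligned_prev, aligned_next])
    print(fused[1])
```

Backward warping is used here because every pixel of the reference grid then receives exactly one interpolated value, which avoids the holes that forward splatting would leave. In a trained network both the flow and the fusion weights would be learned end to end instead of fixed by hand.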

CLC Number: TP391

MENG Xiang-yu, XUE Xin-wei, LI Wen-lin, WANG Yi. Motion-estimation Based Space-temporal Feature Aggregation Network for Multi-frames Rain Removal[J]. Computer Science, 2021, 48(5): 170-176. https://doi.org/10.11896/jsjkx.210100104

