Computer Science, 2022, Vol. 49, Issue (2): 123-133. doi: 10.11896/jsjkx.211000007

• Computer Vision: Theory and Application •

Survey on Video Super-resolution Based on Deep Learning

LENG Jia-xu1,2, WANG Jia1, MO Meng-jing-cheng1, CHEN Tai-yue1, GAO Xin-bo1

  1 Key Laboratory of Image Cognition, Chongqing University of Posts and Telecommunications, Chongqing 400065, China
  2 Jiangsu Key Laboratory of Image and Video Understanding for Social Safety, Nanjing University of Science and Technology, Nanjing 210094, China
  • Received: 2021-09-30 Revised: 2021-11-04 Online: 2022-02-15 Published: 2022-02-23
  • Corresponding author: GAO Xin-bo (gaoxb@cqupt.edu.cn)
  • Author contact: lengjx@cqupt.edu.cn
  • About author: LENG Jia-xu, born in 1989, Ph.D. His main research interests include object detection, face super-resolution, person re-identification and video anomaly detection.
    GAO Xin-bo, born in 1972, Ph.D., professor, Ph.D. supervisor. His main research interests include artificial intelligence, machine learning, computer vision and pattern recognition.
  • Supported by:
    National Natural Science Foundation of China (62036007, 62050175, 62102057) and Science and Technology Research Program of Chongqing Municipal Education Commission (KJQN-202100627).

Abstract: Video super-resolution (VSR) aims to reconstruct a high-resolution video from its corresponding low-resolution version. In recent years, VSR has made great progress driven by deep learning. To further promote the development of VSR, this survey classifies, analyzes, and compares deep-learning-based VSR algorithms. First, existing methods are grouped into two categories according to their network framework, namely iterative-network-based and recurrent-network-based VSR approaches, and the advantages and disadvantages of the different network models are compared. Second, VSR datasets are introduced comprehensively, and existing algorithms are summarized and compared on commonly used public benchmark datasets. Finally, the key challenges of VSR algorithms are analyzed and their application prospects are discussed.

Key words: Convolutional neural network, Deep learning, Inter-frame information, Video super-resolution

CLC Number: TP183