Efficient Video Super-Resolution with Latent Attention

doi:10.11896/jsjkx.221100156

Abstract

Video super-resolution reconstructs high-resolution videos from low-resolution ones, and exploiting the spatio-temporal correlations in a video is an effective way to do so. Prior work mainly relies on motion compensation to capture temporal dependencies in video generation, which leads to inefficient stage-wise modeling strategies. Compared with motion compensation, an attention model searches for spatio-temporal correlations more efficiently. In this paper, we formulate a latent attention model that estimates attention with amortized variational inference, and we instantiate two effective attention modules for video super-resolution. Building on this model, we present a novel deep network that captures spatio-temporal correlations efficiently for video super-resolution and admits end-to-end learning. Extensive experiments on public video datasets demonstrate that our approach outperforms several state-of-the-art methods such as SPMC and DUF-16L.

Key words: Super-resolution, Deep learning, Latent attention, Variational inference, Efficient

CLC Number: TP391.41

WANG Yuji, DONG Haocheng, GONG Xueluan, CHEN Yanjiao. Efficient Video Super-Resolution with Latent Attention [J]. Computer Science, 2023, 50(11A): 221100156-10.
