Computer Science ›› 2024, Vol. 51 ›› Issue (7): 197-205. doi: 10.11896/jsjkx.230400102

• Computer Graphics & Multimedia •

Study on Algorithm of Depth Image Super-resolution Guided by High-frequency Information of Color Images

LI Jiaying1, LIANG Yudong1,2, LI Shaoji1, ZHANG Kunpeng1, ZHANG Chao1,2

  1. 1 School of Computer and Information Technology, Shanxi University, Taiyuan 030006, China
     2 Key Laboratory of Ministry of Education for Computational Intelligence and Chinese Information Processing, Shanxi University, Taiyuan 030006, China
  • Received: 2023-04-16 Revised: 2023-09-21 Online: 2024-07-15 Published: 2024-07-10
  • Corresponding author: LIANG Yudong (liangyudong@sxu.edu.cn)
  • About author: LI Jiaying, born in 1998, master. Her main research interests include computer vision and image processing. (202122407023@email.sxu.edu.cn)
    LIANG Yudong, born in 1988, Ph.D, associate professor, is a member of CCF (No.85977M). His main research interests include computer vision, image processing, and deep learning-based applications.
  • Supported by:
    National Natural Science Foundation of China (61802237, 62272284), Fundamental Research Program of Shanxi Province (202203021221002, 202203021211291), Natural Science Foundation of Shanxi Province, China (201901D211176, 202103021223464), Scientific and Technological Innovation Programs of Higher Education Institutions in Shanxi (2019L0066), Science and Technology Major Project of Shanxi Province, China (202101020101019), Key R&D Program of Shanxi Province (202102070301019) and Special Fund for Science and Technology Innovation Teams of Shanxi Province (202204051001015).

Abstract: Depth image information is an important component of 3D scene information. However, due to the limitations of acquisition equipment and the diversity of imaging environments, the depth images acquired by depth sensors often have low resolution and little high-frequency information, which limits their further application in various computer vision tasks. Depth image super-resolution attempts to improve the resolution of depth images and is a practical and valuable task. RGB images of the same scene have high resolution and rich texture information, and some depth image super-resolution algorithms achieve significant performance gains by introducing RGB images of the same scene to provide guidance information. However, due to the modality inconsistency between RGB images and depth maps, how to fully and effectively utilize RGB information to assist depth image super-resolution reconstruction remains extremely challenging. To this end, this paper proposes a depth image super-resolution algorithm guided by the high-frequency information of color images. Specifically, a high-frequency feature extraction module is designed to adaptively learn the high-frequency information of color images to guide the reconstruction of depth map edges. In addition, a feature self-attention module is designed to capture the global dependencies between features while extracting deeper features to help recover details in the depth image. After cross-modal fusion, the depth image features and the color-image-guided features are recombined, and a multi-scale feature fusion module is used to fuse the spatial structure information between features at different scales, obtaining reconstruction information that covers multi-level receptive fields. Finally, the corresponding high-resolution depth map is recovered through the depth reconstruction module. Experimental results on public datasets demonstrate that the proposed method outperforms comparative methods both quantitatively and qualitatively, which verifies its effectiveness.

Key words: Depth image super-resolution reconstruction, Deep learning, Cross-modal feature fusion, High-frequency information, Self-attention mechanism
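The high-frequency guidance idea described in the abstract can be illustrated with a minimal sketch. The following is a hypothetical illustration, not the authors' implementation: a fixed Laplacian high-pass filter stands in for the learned high-frequency feature extraction module, nearest-neighbour upsampling stands in for the depth reconstruction path, and a scalar `weight` stands in for the learned cross-modal fusion.

```python
import numpy as np

def high_frequency(gray):
    """Extract high-frequency (edge) information with a Laplacian high-pass filter."""
    kernel = np.array([[0,  1, 0],
                       [1, -4, 1],
                       [0,  1, 0]], dtype=float)
    h, w = gray.shape
    padded = np.pad(gray, 1, mode="edge")
    out = np.zeros((h, w), dtype=float)
    for i in range(h):
        for j in range(w):
            out[i, j] = np.sum(padded[i:i + 3, j:j + 3] * kernel)
    return out

def upsample_nearest(depth, scale):
    """Nearest-neighbour upsampling of a low-resolution depth map."""
    return np.repeat(np.repeat(depth, scale, axis=0), scale, axis=1)

def guided_reconstruction(lr_depth, hr_gray, scale, weight=0.1):
    """Fuse the upsampled depth map with color high-frequency guidance (illustrative only)."""
    up = upsample_nearest(lr_depth, scale).astype(float)
    hf = high_frequency(hr_gray.astype(float))
    return up + weight * hf
```

In the paper both the filter and the fusion are learned end to end; the sketch only shows why color edges can sharpen depth edges: the high-pass response is zero in flat regions and non-zero exactly where the color image has discontinuities.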

CLC number: TP391
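The feature self-attention module mentioned in the abstract captures global dependencies between features. Scaled dot-product self-attention, the standard formulation such a module typically builds on, can be sketched as follows (illustrative only; the projection matrices `wq`, `wk`, `wv` are assumptions, not the paper's exact design):

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    e = np.exp(x - np.max(x, axis=axis, keepdims=True))
    return e / np.sum(e, axis=axis, keepdims=True)

def self_attention(features, wq, wk, wv):
    """Scaled dot-product self-attention over a set of feature vectors.

    features: (n, d) array; wq/wk/wv: (d, d) projection matrices.
    Each output vector is a weighted sum over ALL inputs, so every
    position attends to the whole feature map (a global dependency).
    """
    q, k, v = features @ wq, features @ wk, features @ wv
    d = q.shape[-1]
    attn = softmax(q @ k.T / np.sqrt(d), axis=-1)  # (n, n) attention weights, rows sum to 1
    return attn @ v
```

Unlike a convolution, whose receptive field is bounded by the kernel size, the (n, n) attention map lets any feature position draw information from any other position in one step, which is what allows the module to aggregate global context for detail recovery.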