Computer Science ›› 2025, Vol. 52 ›› Issue (6): 228-238. doi: 10.11896/jsjkx.241200092

• Computer Graphics & Multimedia •

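Research on a Depth Image Super-Resolution Algorithm Based on Color-Image-Guided Modulation and Fusion of High- and Low-Frequency Features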

徐晗智1, 李嘉莹2, 梁宇栋2,3, 魏巍2,3   

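    1 School of Mathematics and Statistics, Shanxi University, Taiyuan 030006
    2 School of Computer and Information Technology, Shanxi University, Taiyuan 030006
    3 Key Laboratory of Ministry of Education for Computation Intelligence and Chinese Information Processing, Shanxi University, Taiyuan 030006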
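  • Received: 2024-12-11 Revised: 2025-03-25 Publication date: 2025-06-15 Release date: 2025-06-11
  • Corresponding author: LIANG Yudong (liangyudong@sxu.edu.cn)
  • About the author: (202200901161@email.sxu.edu.cn)
  • Supported by:
    National Natural Science Foundation of China (61802237, 62272284); Fundamental Research Program of Shanxi Province (Free Exploration) (202203021221002, 202203021211291); Natural Science Foundation of Shanxi Province (201901D211176, 202103021223464); Scientific and Technological Innovation Programs of Higher Education Institutions in Shanxi (2019L0066); Science and Technology Major Project of Shanxi Province (202101020101019); Key R&D Program of Shanxi Province (202102070301019); Science and Technology Innovation Young Talent Team Project of Shanxi Province (202204051001015)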

Research on Depth Image Super-resolution Algorithm for High and Low Frequency Feature Modulation Fusion Guided by Color Images

XU Hanzhi1, LI Jiaying2, LIANG Yudong2,3, WEI Wei2,3   

    1 School of Mathematical Science, Shanxi University, Taiyuan 030006, China
    2 School of Computer and Information Technology, Shanxi University, Taiyuan 030006, China
    3 Key Laboratory of Ministry of Education for Computation Intelligence and Chinese Information Processing, Shanxi University, Taiyuan 030006, China
  • Received: 2024-12-11 Revised: 2025-03-25 Published: 2025-06-15 Online: 2025-06-11
  • About author: XU Hanzhi, born in 2004, undergraduate. His main research interests include computer vision and image processing.
    LIANG Yudong, born in 1988, Ph.D, associate professor, is a member of CCF (No.85977M). His main research interests include computer vision, image processing and deep learning-based applications.
  • Supported by:
    National Natural Science Foundation of China (61802237, 62272284), Fundamental Research Program of Shanxi Province (202203021221002, 202203021211291), Natural Science Foundation of Shanxi Province (201901D211176, 202103021223464), Scientific and Technological Innovation Programs of Higher Education Institutions in Shanxi (2019L0066), Science and Technology Major Project of Shanxi Province (202101020101019), Key R&D Program of Shanxi Province (202102070301019) and Special Fund for Science and Technology Innovation Teams of Shanxi (202204051001015).

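Abstract: Depth images can effectively describe the information of a three-dimensional scene. However, owing to the limitations of acquisition devices and non-ideal imaging environments, the depth images captured by depth sensors tend to have low resolution and little high-frequency information, so improving their resolution is of great importance. Some depth map super-resolution algorithms introduce an RGB image of the same scene to guide the super-resolution process, which markedly improves performance. How to use the RGB information fully and effectively, reduce the modal inconsistency between depth maps and RGB images, and guide the depth map super-resolution reconstruction process remains highly challenging. Most existing methods concentrate on high-frequency information and neglect low-frequency global information, which limits further performance gains. To this end, a color-image-guided depth image super-resolution reconstruction algorithm based on the modulation and fusion of high- and low-frequency features is proposed. Specifically, a dual-branch feature extraction module extracts high- and low-frequency features from the color image and the depth image separately; within each branch, a CNN and a Transformer extract local high-frequency and global low-frequency information, respectively, and a bidirectional modulation module carries out bidirectional transformation and fusion between the high-frequency information of the color and depth images and between their low-frequency information. Through bidirectional modulation across modalities within each frequency band and the subsequent fusion of high- and low-frequency information, the model fully mines the complementary information between the depth image and the color image, so that color-guided depth super-resolution achieves better reconstruction. In addition, an invertible neural network (INN) is used for lossless information compression to better extract high-frequency detail, and a quadtree attention mechanism effectively reduces the computational complexity of extracting global information with the Transformer, improving the efficiency of the algorithm. Experiments on public datasets show that the proposed method outperforms the compared methods both quantitatively and qualitatively and achieves good subjective visual quality.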

Key words: Depth map super-resolution reconstruction, Hybrid features, Bidirectional modulation, Quadtree attention mechanism
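
The abstract names a bidirectional modulation module but does not give its formula. As a rough illustration only, the sketch below assumes a FiLM-style form in which each modality's features are scaled and shifted by values derived from the other modality, with the two results averaged and the high- and low-frequency bands concatenated. The function names, the `gain` and `bias` parameters, the averaging step, and the toy numbers are all hypothetical and are not taken from the paper.

```python
import math


def sigmoid(value):
    return 1.0 / (1.0 + math.exp(-value))


def modulate(target, guide, gain=1.0, bias=0.1):
    """Scale and shift `target` features using values derived from `guide`.

    Assumed FiLM-style form; the abstract does not give the actual formula.
    """
    gates = [sigmoid(gain * g) for g in guide]   # per-element scale in (0, 1)
    shifts = [bias * g for g in guide]           # per-element shift
    return [t * gate + shift for t, gate, shift in zip(target, gates, shifts)]


def bidirectional_modulation(rgb_feat, depth_feat):
    """Modulate each modality with the other, then fuse by averaging."""
    depth_mod = modulate(depth_feat, rgb_feat)   # color guides depth
    rgb_mod = modulate(rgb_feat, depth_feat)     # depth guides color
    return [0.5 * (r + d) for r, d in zip(rgb_mod, depth_mod)]


def fuse_frequencies(rgb_high, depth_high, rgb_low, depth_low):
    """Modulate within each frequency band, then concatenate the two bands."""
    high = bidirectional_modulation(rgb_high, depth_high)
    low = bidirectional_modulation(rgb_low, depth_low)
    return high + low


# Toy feature vectors (made-up numbers, not data from the paper).
rgb_high, depth_high = [0.9, -0.4, 0.2], [0.7, -0.1, 0.0]
rgb_low, depth_low = [0.5, 0.5, 0.6], [0.4, 0.6, 0.6]
fused = fuse_frequencies(rgb_high, depth_high, rgb_low, depth_low)
```

In this toy form the modulation runs in both directions within a band before the bands are combined, which mirrors the order described in the abstract. A real module would presumably operate on learned feature maps rather than short lists.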

Abstract: Depth images effectively describe 3D scene information. However, the limitations of acquisition equipment and non-ideal imaging environments mean that the depth images captured by depth sensors often have low resolution and little high-frequency information, so improving their resolution is important. Some depth map super-resolution algorithms significantly improve performance by introducing RGB images of the same scene as guidance. The key challenge is to exploit the RGB information fully and effectively to guide depth map super-resolution reconstruction while mitigating the modal inconsistency between depth maps and RGB images. Existing methods focus primarily on high-frequency information and overlook low-frequency global information, which limits their performance. To address these limitations, this paper proposes a color image-guided depth image super-resolution reconstruction algorithm based on the modulation and fusion of high- and low-frequency features. A two-branch feature extraction module extracts high- and low-frequency features from the color and depth images, respectively. In each branch, a CNN extracts local high-frequency information and a Transformer extracts global low-frequency information. A bidirectional modulation module performs bidirectional transformation and fusion between the high-frequency information of the color and depth images and between their low-frequency information. Through bidirectional modulation across modalities within each frequency band, followed by fusion of the high- and low-frequency information, the model fully exploits the complementary information between the depth image and the color image, so that color-guided depth super-resolution achieves better reconstruction results. In addition, an invertible neural network (INN) performs lossless information compression to extract high-frequency detail more effectively, and a quadtree attention mechanism reduces the computational complexity of the Transformer in extracting global information, improving the efficiency of the algorithm. Experimental results on public datasets show that the proposed method outperforms the compared methods both quantitatively and qualitatively and achieves better subjective visual quality.

Key words: Depth image super-resolution reconstruction, Hybrid features, Bidirectional modulation, Quadtree attention mechanism
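
The abstract says an invertible neural network (INN) is used for lossless information compression, without describing the block design. The sketch below shows one common way such invertibility is obtained, an affine coupling layer: half of the features pass through unchanged and parameterize a scale and shift of the other half, so the input can be recovered in closed form. This is a generic illustration under that assumption, not the paper's architecture; `AffineCoupling`, `make_subnet`, and the random weights standing in for learned sub-networks are hypothetical.

```python
import math
import random


def make_subnet(rng, size):
    """Random single-layer map standing in for a learned sub-network (hypothetical)."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(size)] for _ in range(size)]

    def subnet(vector):
        return [math.tanh(sum(w * x for w, x in zip(row, vector))) for row in weights]

    return subnet


class AffineCoupling:
    """y1 = x1, y2 = x2 * exp(s(x1)) + t(x1); invertible in closed form."""

    def __init__(self, channels, seed=0):
        if channels % 2:
            raise ValueError("channels must be even")
        rng = random.Random(seed)
        self.half = channels // 2
        self.log_scale = make_subnet(rng, self.half)
        self.shift = make_subnet(rng, self.half)

    def forward(self, x):
        x1, x2 = x[:self.half], x[self.half:]
        s, t = self.log_scale(x1), self.shift(x1)
        y2 = [b * math.exp(si) + ti for b, si, ti in zip(x2, s, t)]
        return x1 + y2  # list concatenation: untouched half, transformed half

    def inverse(self, y):
        y1, y2 = y[:self.half], y[self.half:]
        s, t = self.log_scale(y1), self.shift(y1)  # same values as in forward
        x2 = [(b - ti) * math.exp(-si) for b, si, ti in zip(y2, s, t)]
        return y1 + x2


# Toy feature vector (made-up numbers, not data from the paper).
features = [0.3, -1.2, 0.8, 2.0, -0.5, 1.1]
block = AffineCoupling(channels=len(features))
encoded = block.forward(features)
restored = block.inverse(encoded)
max_error = max(abs(a - b) for a, b in zip(features, restored))
```

Because the first half is copied through unchanged, the inverse can recompute exactly the same scale and shift, so no information is discarded by the transform. The reconstruction error is limited only by floating-point rounding.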

CLC Number: 

  • TP391