Computer Science ›› 2023, Vol. 50 ›› Issue (8): 125-132. doi: 10.11896/jsjkx.220600046

• Computer Graphics & Multimedia •


Lightweight Multi-view Stereo Integrating Coarse Cost Volume and Bilateral Grid

ZHANG Xiao, DONG Hongbin   

  1. College of Computer Science and Technology, Harbin Engineering University, Harbin 150001, China
  • Received: 2022-06-06 Revised: 2022-11-06 Online: 2023-08-15 Published: 2023-08-02
  • Corresponding author: DONG Hongbin (donghongbin@hrbeu.edu.cn)
  • About author: ZHANG Xiao, born in 1998, postgraduate (zhangxiao980516@163.com). His main research interests include deep learning and 3D reconstruction.
    DONG Hongbin, born in 1963, professor, Ph.D. supervisor, is a member of China Computer Federation. His main research interests include evolutionary computation, machine learning and multi-agent systems.
  • Supported by:
    Natural Science Foundation of Heilongjiang Province, China (LH2020F023).


Abstract: To address the large memory consumption, slow inference, and poor reconstruction quality in ill-posed (e.g., weakly textured) regions of deep-learning-based multi-view stereo (MVS) reconstruction algorithms, this paper proposes a lightweight cascade MVS reconstruction network built on a bilateral grid and a fused cost volume. First, a cost volume upsampling module based on a learned bilateral grid is constructed, which efficiently restores a low-resolution cost volume to a high-resolution one. Then, lightweight dynamic region convolution and a coarse cost volume fusion module are introduced to strengthen the network's ability to represent features in ill-posed regions and to perceive the global and structural information of the scene. Experimental results show that the proposed method achieves competitive accuracy on the DTU dataset and the Tanks and Temples benchmark while significantly outperforming other methods in memory consumption and inference speed.
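
To make the bilateral-grid-based upsampling step concrete, the following is a minimal, illustrative PyTorch sketch of the "slicing" operation in the spirit of bilateral grid learning for stereo matching (Xu et al.); it is not the authors' implementation, and the function name, tensor layout, and shapes are assumptions chosen for illustration. A coarse cost volume stored as a bilateral grid, indexed by coarse spatial position and a guide-intensity bin, is trilinearly sampled with a full-resolution guide map to recover an edge-aware high-resolution cost volume.

import torch
import torch.nn.functional as F

def slice_bilateral_grid(cost_grid, guide):
    # cost_grid: (B, D, Gh, Gw, Gg) -- low-resolution bilateral grid of matching
    #            costs over D depth hypotheses, coarse spatial cells (Gh, Gw),
    #            and Gg guide-intensity bins.
    # guide:     (B, 1, H, W) -- full-resolution guide map with values in [0, 1].
    # returns:   (B, D, H, W) -- upsampled, edge-aware high-resolution cost volume.
    B, D, Gh, Gw, Gg = cost_grid.shape
    _, _, H, W = guide.shape

    # Full-resolution spatial coordinates normalized to [-1, 1] for grid_sample.
    ys = torch.linspace(-1.0, 1.0, H, device=guide.device)
    xs = torch.linspace(-1.0, 1.0, W, device=guide.device)
    gy, gx = torch.meshgrid(ys, xs, indexing="ij")           # each (H, W)
    gy = gy.expand(B, H, W)
    gx = gx.expand(B, H, W)

    # The guide intensity selects the position along the grid's range axis.
    gz = guide.squeeze(1) * 2.0 - 1.0                         # (B, H, W)

    # Reorder the grid so its last three dims are (Gg, Gh, Gw), matching
    # grid_sample's (z, y, x) sampling convention for 5-D inputs.
    vol = cost_grid.permute(0, 1, 4, 2, 3)                    # (B, D, Gg, Gh, Gw)
    coords = torch.stack([gx, gy, gz], dim=-1).unsqueeze(1)   # (B, 1, H, W, 3)

    # Trilinear "slicing": one interpolated cost per pixel and depth hypothesis.
    sliced = F.grid_sample(vol, coords, mode="bilinear", align_corners=True)
    return sliced.squeeze(2)                                  # (B, D, H, W)

In a cascade setting, the sliced high-resolution volume would then feed the next stage's depth regression in place of a cost volume regularized at full resolution by 3D convolutions, which is where the memory and inference-speed savings would come from.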

Key words: 3D reconstruction, Multi-view stereo, Deep learning, Bilateral grid, Lightweight

CLC Number: TP391