Computer Science ›› 2023, Vol. 50 ›› Issue (8): 125-132. doi: 10.11896/jsjkx.220600046

• Computer Graphics & Multimedia •

Lightweight Multi-view Stereo Integrating Coarse Cost Volume and Bilateral Grid

ZHANG Xiao, DONG Hongbin   

  1. College of Computer Science and Technology, Harbin Engineering University, Harbin 150001, China
  • Received: 2022-06-06 Revised: 2022-11-06 Online: 2023-08-15 Published: 2023-08-02
  • About author: ZHANG Xiao, born in 1998, postgraduate. His main research interests include deep learning and 3D reconstruction.
    DONG Hongbin, born in 1963, professor, Ph.D. supervisor, is a member of China Computer Federation. His main research interests include evolutionary computation, machine learning and multi-agent systems.
  • Supported by:
    Natural Science Foundation of Heilongjiang Province, China (LH2020F023).

Abstract: Deep learning-based multi-view stereo (MVS) reconstruction algorithms suffer from high memory consumption, poor real-time performance, and low reconstruction quality in low-textured areas. To address these problems, this paper proposes a lightweight cascade MVS reconstruction network based on a bilateral grid and a fused cost volume. First, a cost volume upsampling module is built on a learned bilateral grid, which efficiently restores a low-resolution cost volume to a high-resolution one. Then, dynamic region convolution and a coarse cost volume fusion module are used to improve the network's ability to extract features from challenging areas and to perceive the global and structural information of the scene. Experimental results show that the proposed method achieves competitive results on the DTU dataset and the Tanks and Temples benchmark, and is significantly better than other methods in memory consumption and inference speed.

Key words: 3D reconstruction, Multi-view stereo, Deep learning, Bilateral grid, Lightweight

CLC Number: 

  • TP391
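
The abstract describes restoring a low-resolution cost volume to high resolution with a learned bilateral grid. The sketch below illustrates only the generic "slicing" step of a bilateral grid, under stated assumptions: it is not the paper's module. The function names, the `[depth][guide bin][y][x]` grid layout, and the hand-written grid and guidance values are all hypothetical; in a learned setting both the grid and the guidance map would be produced by the network.

```python
import math


def _lerp_index(pos, size):
    """Clamp a continuous position to [0, size - 1]; return (lo, hi, weight_of_hi)."""
    pos = min(max(pos, 0.0), size - 1.0)
    lo = int(math.floor(pos))
    hi = min(lo + 1, size - 1)
    return lo, hi, pos - lo


def slice_bilateral_grid(grid, guide):
    """Upsample a low-resolution cost grid by slicing it with a guidance map.

    grid[d][g][y][x]: costs for D depth hypotheses, G guidance bins, h x w cells
                      (assumed layout, for illustration only).
    guide[Y][X]:      high-resolution guidance values in [0, 1].
    Returns cost[d][Y][X] at the resolution of the guidance map.
    """
    D, G = len(grid), len(grid[0])
    h, w = len(grid[0][0]), len(grid[0][0][0])
    H, W = len(guide), len(guide[0])
    out = [[[0.0] * W for _ in range(H)] for _ in range(D)]
    for Y in range(H):
        # Map the high-resolution pixel centre into low-resolution grid coordinates.
        y0, y1, wy = _lerp_index((Y + 0.5) * h / H - 0.5, h)
        for X in range(W):
            x0, x1, wx = _lerp_index((X + 0.5) * w / W - 0.5, w)
            # The guidance value selects a position along the grid's guide axis.
            g0, g1, wg = _lerp_index(guide[Y][X] * (G - 1), G)
            for d in range(D):
                acc = 0.0
                # Trilinear interpolation over (guide bin, y, x).
                for gi, a in ((g0, 1.0 - wg), (g1, wg)):
                    for yi, b in ((y0, 1.0 - wy), (y1, wy)):
                        for xi, c in ((x0, 1.0 - wx), (x1, wx)):
                            acc += a * b * c * grid[d][gi][yi][xi]
                out[d][Y][X] = acc
    return out


# Toy example: one depth hypothesis, two guidance bins, a single 1 x 1 spatial cell.
grid = [[[[0.0]], [[1.0]]]]
guide = [[0.0, 0.25], [0.5, 1.0]]
cost = slice_bilateral_grid(grid, guide)
```

In this toy case the single low-resolution cell is expanded to a 2 x 2 output whose values follow the guidance map, which is the edge-aware behaviour that makes bilateral-grid upsampling attractive compared with plain interpolation of the cost volume.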