Computer Science ›› 2025, Vol. 52 ›› Issue (3): 231-238. doi: 10.11896/jsjkx.231200111
CHEN Guangyuan, WANG Zhaohui, CHENG Ze
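Abstract: Deep-learning-based multi-view stereo (MVS) reconstruction algorithms still suffer from incomplete image feature extraction, ambiguous cost volume matching, and accumulating depth errors, which lead to poor reconstruction in textureless and repetitive-texture regions. To address these problems, this paper proposes a cascade MVS network built on context-guided cost volume construction and depth refinement. First, a feature fusion module based on parameter-free attention filters out useless features and fuses the remaining features to resolve inconsistency across scales. Then, a context-guided cost volume module fuses global information to improve the completeness and robustness of cost volume matching. Finally, a depth refinement module learns a depth residual to improve the accuracy of the depth map at low resolution. Experimental results show that, on the DTU dataset, the network reduces the completeness error by 24.4%, the accuracy error by 4.1%, and the overall error by 14.3% compared with MVSNet. On the Tanks and Temples dataset it also outperforms most algorithms, showing strong competitiveness.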
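To make the idea of parameter-free attention concrete, the following is a minimal sketch that assumes a SimAM-style energy formulation. The abstract does not state which attention variant the network uses, so the function name `parameter_free_attention`, the regularizer `lam`, and the (C, H, W) layout are illustrative assumptions rather than the authors' implementation. The point it illustrates is that each activation is reweighted by how much it deviates from its channel mean, with no learnable weights involved.

```python
import numpy as np


def parameter_free_attention(features, lam=1e-4):
    """SimAM-style reweighting of a (C, H, W) feature map.

    Assumption: this follows the SimAM energy formulation; the paper's
    actual module may differ. Requires H * W > 1.
    """
    _, h, w = features.shape
    n = h * w - 1  # number of "other" positions per channel

    # Squared deviation of every activation from its channel mean.
    mean = features.mean(axis=(1, 2), keepdims=True)
    sq_dev = (features - mean) ** 2

    # Per-channel variance estimate, regularized by lam for stability.
    variance = sq_dev.sum(axis=(1, 2), keepdims=True) / n

    # Inverse energy: activations far from the mean score higher.
    inv_energy = sq_dev / (4.0 * (variance + lam)) + 0.5

    # Sigmoid turns the score into a weight in (0, 1).
    weights = 1.0 / (1.0 + np.exp(-inv_energy))
    return features * weights


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feats = rng.standard_normal((8, 16, 16))
    refined = parameter_free_attention(feats)
    print(refined.shape)
```

Because the weights are computed from the feature statistics alone, a module of this kind can be placed before multi-scale feature fusion without adding trainable parameters, which is consistent with the filtering role the abstract describes.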