Computer Science, 2024, Vol. 51, Issue (11A): 240300045-8. doi: 10.11896/jsjkx.240300045
HE Weilong1, SU Lingli1, GUO Bingxuan2, LI Maosen3, HAO Yan1
Abstract: Binocular stereo vision has long been of central importance in computer vision research. Unlike monocular or multi-view techniques, binocular stereo vision can recover image depth accurately while remaining low-cost, widely applicable, and easy to deploy. 3D perception based on binocular vision greatly improves a computer's ability to understand and interact with the real world, and strengthens the adaptability of computer vision systems in complex, changing scenes, playing an important role in autonomous driving, robot navigation, industrial inspection, aerospace, and other fields. This paper focuses on 3D reconstruction and target perception in dynamic scenes. In most cases, the dynamic targets in the field of view are the ones that actually require attention, whereas static content, especially the background and static objects that occupy most of the scene most of the time, can usually be ignored, yet consumes a large share of the computation in practice. Spending substantial computing resources on targets of no interest is clearly wasteful and inefficient. To address this problem, building on a thorough study of mainstream binocular stereo matching and image segmentation methods, this paper proposes a 3D perception technique for dynamic scenes based on binocular estimation. The main contributions are as follows. First, to address the inefficiency of per-pixel cost aggregation in traditional binocular stereo matching algorithms, a stereo matching method based on 2D instance segmentation is proposed: matching is performed on mask-segmented target images, which both improves matching performance and reduces the difficulty of matching dynamic targets. To compensate for insufficient segmentation accuracy, a mask edge filtering refinement based on the RGB image is introduced, improving efficiency while raising the accuracy of the reconstructed point cloud. Second, target point clouds are generated in real time with a binocular depth-estimation deep network, and a GPU-accelerated real-time dynamic target perception algorithm based on point clouds of neighboring frames is proposed. Finally, an integrated 2D-3D real-time dynamic target perception technique is presented, which performs real-time 3D reconstruction of the target scene while rapidly detecting dynamic objects in the environment.
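The core idea of mask-restricted stereo matching can be illustrated with a minimal sketch. The function below is a hypothetical, CPU-only SAD block matcher (not the paper's method, which builds on modern matching networks): it evaluates matching cost only at pixels inside an instance mask, so no aggregation cost is spent on background pixels. All names and parameters here are illustrative assumptions.

```python
import numpy as np

def masked_block_match(left, right, mask, max_disp=8, win=3):
    """Brute-force SAD stereo matching, evaluated only where mask is nonzero.

    left, right: rectified grayscale images (H, W); mask: boolean (H, W).
    Returns an integer disparity map, zero outside the mask.
    """
    h, w = left.shape
    half = win // 2
    disp = np.zeros((h, w), dtype=np.int32)
    for y in range(half, h - half):
        for x in range(half + max_disp, w - half):
            if not mask[y, x]:
                continue  # background pixel: skip entirely, spending no matching cost
            patch = left[y - half:y + half + 1, x - half:x + half + 1].astype(np.int64)
            best_cost, best_d = None, 0
            for d in range(max_disp):
                cand = right[y - half:y + half + 1,
                             x - d - half:x - d + half + 1].astype(np.int64)
                cost = int(np.abs(patch - cand).sum())  # sum of absolute differences
                if best_cost is None or cost < best_cost:
                    best_cost, best_d = cost, d
            disp[y, x] = best_d
    return disp
```

With a synthetic pair where the left image is the right image shifted by two pixels, masked interior pixels recover disparity 2, while an all-zero mask skips every pixel.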
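The neighboring-frame dynamic target perception step can likewise be sketched in miniature. The function below is an illustrative assumption, not the paper's GPU-accelerated algorithm: it flags a point in the current frame's cloud as dynamic when it has no nearest neighbor in the previous frame's cloud within a distance threshold, using brute-force NumPy broadcasting on the CPU.

```python
import numpy as np

def dynamic_points(prev_cloud, curr_cloud, thresh=0.05):
    """Flag points in curr_cloud (M, 3) with no neighbor in prev_cloud (N, 3)
    closer than thresh. Returns a boolean array of length M (True = dynamic)."""
    # Pairwise distance matrix of shape (M, N) via broadcasting.
    d = np.linalg.norm(curr_cloud[:, None, :] - prev_cloud[None, :, :], axis=2)
    # A point with no close counterpart in the previous frame has moved (or appeared).
    return d.min(axis=1) > thresh
```

For example, static points perturbed by sensor noise well below the threshold stay unflagged, while a point far from every previous-frame point is reported as dynamic. A production version would replace the dense distance matrix with a spatial index or a GPU kernel.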