Computer Science ›› 2021, Vol. 48 ›› Issue (9): 216-222. doi: 10.11896/jsjkx.200800203

• Computer Graphics & Multimedia •

Real-time Binocular Depth Estimation Algorithm Based on Semantic Edge Drive

ZHANG Peng, WANG Xin-qing, XIAO Yi, DUAN Bao-guo, XU Hong-hui   

  1. Department of Mechanical Engineering, College of Field Engineering, Army Engineering University, Nanjing 210007, China
  • Received: 2020-08-29 Revised: 2020-09-08 Online: 2021-09-15 Published: 2021-09-10
  • About author: ZHANG Peng, born in 1995, postgraduate. His main research interests include deep learning, computer vision and point cloud processing.
    WANG Xin-qing, born in 1963, Ph.D, professor, Ph.D supervisor. His main research interests include intelligent signal processing and deep learning.
  • Supported by:
    National Natural Science Foundation of China (61671470), National Basic Research Program of China (2016YFC0802904) and China Postdoctoral Science Foundation (2017M623423)

Abstract: To address the problems that stereo matching exhibits in ill-posed regions, namely blurred disparity edges, unsmooth disparity, discontinuous disparity within a single object, and holes, a lightweight real-time binocular depth estimation algorithm is proposed. The semantic labels obtained by semantic segmentation of the scene image and the edge detail images obtained by edge detection serve as auxiliary losses, and the ground-truth image serves as the main loss; together they form a joint loss function that better supervises the generation of the disparity map. In addition, a lightweight feature extraction module is constructed to reduce redundancy in the feature extraction stage, which simplifies the feature extraction steps and makes the network lighter and faster. Finally, a coarse-to-fine strategy refines the disparity map gradually: deformed low-resolution disparity maps are fused with high-resolution feature maps to generate disparity maps at different scales in stages, and detail features are progressively enriched until the final accurate disparity map is obtained. The algorithm achieves a 3px error rate of 1.72% on the KITTI 2012 dataset; on the Middlebury 2014 dataset, the error rates are 1.23% on Vintage, 2.23% on Playroom, and 1.65% on Recycle. The calculation time on the Scene Flow dataset is 0.76 s with a memory occupation of 2.4 G. These results indicate that the algorithm significantly improves the accuracy and computational efficiency of stereo matching in ill-posed regions, meets the real-time requirements of engineering practice, and provides important guidance for real-time 3D reconstruction tasks.
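
The joint loss described in the abstract, a main disparity loss plus auxiliary semantic and edge losses, can be illustrated with a minimal sketch. The abstract does not specify the individual loss terms or their weights, so everything concrete here is an assumption made for illustration: smooth L1 for the disparity term, cross-entropy for the semantic term, binary cross-entropy for the edge term, the weights `w_sem` and `w_edge`, and all function names. Pixels are passed as flat Python lists to keep the example free of dependencies.

```python
import math


def smooth_l1(pred, target):
    """Mean smooth-L1 distance (assumed form of the main disparity loss)."""
    total = 0.0
    for p, t in zip(pred, target):
        d = abs(p - t)
        # Quadratic near zero, linear for larger errors.
        total += 0.5 * d * d if d < 1.0 else d - 0.5
    return total / len(pred)


def cross_entropy(class_probs, labels, eps=1e-7):
    """Mean cross-entropy (assumed form of the semantic auxiliary loss).

    class_probs[i] is the list of per-class probabilities for pixel i,
    labels[i] is the integer index of the true class.
    """
    total = 0.0
    for probs, y in zip(class_probs, labels):
        total -= math.log(max(probs[y], eps))
    return total / len(labels)


def binary_cross_entropy(prob, label, eps=1e-7):
    """Mean binary cross-entropy (assumed form of the edge auxiliary loss)."""
    total = 0.0
    for p, y in zip(prob, label):
        p = min(max(p, eps), 1.0 - eps)  # clamp to keep log() finite
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(prob)


def joint_loss(disp_pred, disp_gt, sem_probs, sem_labels,
               edge_prob, edge_label, w_sem=0.1, w_edge=0.1):
    """Main loss plus weighted auxiliary losses.

    w_sem and w_edge are hypothetical weights, not values from the paper.
    """
    main = smooth_l1(disp_pred, disp_gt)
    aux_sem = cross_entropy(sem_probs, sem_labels)
    aux_edge = binary_cross_entropy(edge_prob, edge_label)
    return main + w_sem * aux_sem + w_edge * aux_edge


if __name__ == "__main__":
    # Two pixels: predicted vs. ground-truth disparity, two semantic
    # classes, and a binary edge map.
    loss = joint_loss(
        disp_pred=[10.2, 31.0], disp_gt=[10.0, 28.0],
        sem_probs=[[0.9, 0.1], [0.3, 0.7]], sem_labels=[0, 1],
        edge_prob=[0.8, 0.1], edge_label=[1, 0],
    )
    print(f"joint loss: {loss:.4f}")
```

Setting both auxiliary weights to zero reduces the sketch to the main disparity loss alone, which makes it easy to compare training with and without the semantic and edge supervision.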

Key words: Edge extraction, End-to-end network, From coarse to fine, Semantic understanding, Stereo matching

CLC Number: 

  • TP391.41