Computer Science ›› 2024, Vol. 51 ›› Issue (4): 229-235. doi: 10.11896/jsjkx.230100137

• Computer Graphics & Multimedia •

Algorithm of Stereo Matching Based on GAANET

SONG Hao, MAO Kuanmin, ZHU Zhou

  1. School of Mechanical Science & Engineering,Huazhong University of Science and Technology,Wuhan 430074,China
  • Received:2023-01-31 Revised:2023-04-20 Online:2024-04-15 Published:2024-04-10
  • Corresponding author: ZHU Zhou(zhuzhou@hust.edu.cn)
  • About the author: SONG Hao(hsong@hust.edu.cn)
  • Supported by:
    Key R & D Program of Ningxia Hui Autonomous Region,China(2021BFH02001).


Abstract: End-to-end stereo matching algorithms have become increasingly popular in stereo matching tasks due to their advantages in computation time and matching accuracy. However, feature extraction in such algorithms suffers from feature redundancy, information loss, and insufficient multi-scale feature fusion, which raises computational cost and complexity and degrades matching accuracy. To address these problems, an improved ghost adaptive aggregation network (GAANET) is proposed on the basis of the adaptive aggregation network (AANET), with a feature extraction module redesigned to better suit stereo matching. Multi-scale features are extracted with G-Ghost stages, in which part of the feature maps are generated by cheap operations, reducing feature redundancy while preserving shallow features. An efficient channel attention mechanism assigns a separate weight to each channel, and an improved feature pyramid structure mitigates the channel information loss of conventional pyramids and optimizes feature fusion, enriching the information available to features at every scale. GAANET is trained and evaluated on the SceneFlow, KITTI2015, and KITTI2012 datasets. Experimental results demonstrate that it outperforms the baseline method, improving accuracy by 0.92%, 0.25%, and 0.20%, respectively, while using 13.75% fewer parameters and 4.8% less computation.
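The paper's own implementation is not reproduced here; the following PyTorch sketch only illustrates the two channel-level ideas named in the abstract, under assumed, hypothetical names and hyperparameters (GhostModule, ECALayer, ratio, k_size). The first class shows the block-level "cheap operation" idea behind Ghost features (the G-Ghost stage used in GAANET applies it at the stage level across blocks, which this sketch does not attempt); the second shows efficient channel attention in the style of ref.[12], which derives one weight per channel from a 1-D convolution over pooled channel descriptors.

```python
import torch
import torch.nn as nn


class GhostModule(nn.Module):
    """Hypothetical Ghost-style block: a small set of 'intrinsic' feature
    maps comes from an ordinary 1x1 convolution; the remaining 'ghost' maps
    are produced by a cheap depthwise 3x3 convolution."""

    def __init__(self, in_ch: int, out_ch: int, ratio: int = 2):
        super().__init__()
        intrinsic = out_ch // ratio   # channels from the costly convolution
        ghost = out_ch - intrinsic    # channels from the cheap operation
        self.primary = nn.Sequential(
            nn.Conv2d(in_ch, intrinsic, kernel_size=1, bias=False),
            nn.BatchNorm2d(intrinsic),
            nn.ReLU(inplace=True),
        )
        # Depthwise conv: one filter group per intrinsic channel.
        # With ratio=2, ghost == intrinsic, so the grouping divides evenly.
        self.cheap = nn.Sequential(
            nn.Conv2d(intrinsic, ghost, kernel_size=3, padding=1,
                      groups=intrinsic, bias=False),
            nn.BatchNorm2d(ghost),
            nn.ReLU(inplace=True),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        y = self.primary(x)
        return torch.cat([y, self.cheap(y)], dim=1)  # intrinsic + ghost maps


class ECALayer(nn.Module):
    """Efficient channel attention: global average pooling followed by a
    1-D convolution across the channel dimension, i.e. one weight per
    channel with no dimensionality reduction."""

    def __init__(self, k_size: int = 3):
        super().__init__()
        self.conv = nn.Conv1d(1, 1, kernel_size=k_size,
                              padding=k_size // 2, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        n, c, _, _ = x.shape
        y = x.mean(dim=(2, 3))                    # (N, C) channel descriptors
        w = self.conv(y.unsqueeze(1)).squeeze(1)  # local cross-channel mixing
        return x * torch.sigmoid(w).view(n, c, 1, 1)


# Toy usage: a feature tensor passes through a ghost block, then is reweighted.
feats = GhostModule(32, 64)(torch.randn(2, 32, 64, 128))
print(ECALayer()(feats).shape)  # torch.Size([2, 64, 64, 128])
```

With ratio=2, half of the output channels come from the depthwise convolution at a small fraction of the FLOPs of an ordinary convolution; this is the kind of saving the reported parameter and computation reductions refer to.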

Key words: Stereo matching, Feature extraction, End-to-end stereo matching network, Attention module, Deep learning

CLC number: TP391
[1]LIN X,WANG J,LIN C.Research on 3D Reconstruction in Binocular Stereo Vision Based on Feature Point Matching Method[C]//2020 IEEE 3rd International Conference on Information Systems and Computer Aided Education(ICISCAE).Dalian:IEEE,2020:551-556.
[2]CHEN Y,ZHAO L W,ZHAN H C,et al.Study on Reconstruction of Indoor 3D Scene Based on Binocular Vision[J].Computer Science,2020,47(11A):175-177.
[3]CHANG Z T,SHI Y Q,WANG J,et al.Vehicle Speed Measurement Method Based on Binocular Vision[J].Computer Science,2021,48(9):135-139.
[4]DE SANTANA J R,CAMBUIM L F S,BARROS E.Bi-Window Based Stereo Matching Using Combined Siamese Convolutional Neural Network[C]//13th International Conference on Digital Image Processing(ICDIP).Electr Network:SPIE,2021:118781Z.
[5]SEKI A,POLLEFEYS M.SGM-Nets:Semi-global matching with neural networks[C]//30th IEEE/CVF Conference on Computer Vision and Pattern Recognition(CVPR).Honolulu:IEEE,2017:6640-6649.
[6]JIE Z Q,WANG P F,LING Y G,et al.Left-Right Comparative Recurrent Model for Stereo Matching[C]//31st IEEE/CVF Conference on Computer Vision and Pattern Recognition(CVPR).Salt Lake City:IEEE,2018:3838-3846.
[7]WANG Y,LAI Z H,HUANG G,et al.Anytime Stereo Image Depth Estimation on Mobile Devices[C]//IEEE International Conference on Robotics and Automation(ICRA).Montreal:IEEE,2019:5893-5900.
[8]KHAMIS S,FANELLO S,RHEMANN C,et al.Stereonet:Guided hierarchical refinement for real-time edge-aware depth prediction[C]//Proceedings of the European Conference on Computer Vision(ECCV).2018:573-590.
[9]LIU S G,ZHANG T,YANG J G,et al.Progressive dilation residual network for deep binocular stereo matching[J].Journal of Xidian University,2022,49(5):175-180.
[10]LI T,MA W,XU S B,et al.Task-Adaptive End-to-End Networks for Stereo Matching[J].Journal of Computer Research and Development,2020,57(7):1531-1538.
[11]HAN K,WANG Y,XU C,et al.GhostNets on Heterogeneous Devices via Cheap Operations[J].International Journal of Computer Vision,2022,130(4):1050-1069.
[12]WANG Q,WU B,ZHU P,et al.ECA-Net:Efficient Channel Attention for Deep Convolutional Neural Networks[C]//2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition(CVPR).Seattle:IEEE,2020:11534-11542.
[13]LUO Y,CAO X,ZHANG J,et al.CE-FPN:enhancing channel information for object detection[J].Multimedia Tools and Applications,2022,81(21):30685-30704.
[14]XU H F,ZHANG J Y.AANet:Adaptive Aggregation Network for Efficient Stereo Matching[C]//IEEE/CVF Conference on Computer Vision and Pattern Recognition(CVPR).Electr Network:IEEE,2020:1956-1965.
[15]CHABRA R,STRAUB J,SWEENEY C,et al.StereoDRNet:Dilated Residual StereoNet[C]//IEEE/CVF Conference on Computer Vision and Pattern Recognition(CVPR).Long Beach:IEEE,2019:11778-11787.
[16]DAI J,QI H,XIONG Y,et al.Deformable Convolutional Networks[C]//2017 IEEE International Conference on Computer Vision(ICCV).Venice:IEEE,2017:764-773.
[17]CHENG X,ZHONG Y,HARANDI M,et al.Hierarchical neural architecture search for deep stereo matching[J].Advances in Neural Information Processing Systems,2020,33:22158-22169.
[18]SHEN Z,DAI Y C,SONG X B,et al.PCW-Net:Pyramid Combination and Warping Cost Volume for Stereo Matching[C]//17th European Conference on Computer Vision(ECCV).Tel Aviv:Springer,2022:280-297.
[19]SHEN Z L,DAI Y C,RAO Z B,et al.CFNet:Cascade and Fused Cost Volume for Robust Stereo Matching[C]//IEEE/CVF Conference on Computer Vision and Pattern Recognition(CVPR).Electr Network:IEEE,2021:13901-13910.
[20]WANG Q,SHI S H,ZHENG S Z,et al.FADNet:A Fast and Accurate Network for Disparity Estimation[C]//IEEE International Conference on Robotics and Automation(ICRA).Electr Network:IEEE,2020:101-107.