Computer Science ›› 2021, Vol. 48 ›› Issue (8): 157-161. doi: 10.11896/jsjkx.200700134

• Computer Graphics & Multimedia •

Monocular Visual Odometer Based on Deep Learning SuperGlue Algorithm

LIU Shuai1, RUI Ting2, HU Yu-cheng1, YANG Cheng-song2, WANG Dong2

  1 Graduate School, PLA Army Engineering University, Nanjing 210000, China;
  2 School of Field Engineering, PLA Army Engineering University, Nanjing 210000, China
  • Received: 2020-07-21 Revised: 2020-08-25 Published: 2021-08-10
  • Corresponding author: RUI Ting (785344305@qq.com)
  • About author: LIU Shuai, born in 1994, postgraduate. His main research interests include computer vision, machine learning and SLAM applications (15737960205@163.com). RUI Ting, born in 1972, Ph.D, professor, master's advisor. His main research interests include image processing, pattern recognition and artificial intelligence.
  • Supported by:
    National Key Research and Development Program of China (2016YFC0802904).

Abstract: In a visual odometer based on the feature point method, changes in illumination and viewing angle make feature point extraction unstable, which degrades the accuracy of camera pose estimation. To address this problem, a monocular visual odometer modeling method based on the deep learning SuperGlue matching algorithm is proposed. First, feature points are obtained with the SuperPoint detector and encoded into vectors containing the coordinates and descriptors of the feature points. Then, an attentional GNN generates more representative descriptors, an M×N score assignment matrix is created, and the Sinkhorn algorithm is used to solve for the optimal score assignment matrix, which yields the optimal feature matches. Finally, the camera pose is recovered from the optimal feature matches and optimized by minimizing the projection error. Experimental results show that, without back-end optimization, the proposed algorithm is more robust to changes in viewing angle and illumination than visual odometers based on ORB or SIFT, and that its accuracy in terms of absolute trajectory error and relative pose error is significantly improved. This further verifies the feasibility and superiority of the deep-learning-based SuperGlue matching algorithm in visual SLAM.

Key words: Deep learning, Feature matching, GNN, SuperGlue, Visual odometer

CLC Number:

  • TP391.9
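
The matching step summarized in the abstract (an M×N score assignment matrix solved with the Sinkhorn algorithm) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the `dustbin_score` and `min_prob` values, and the toy score matrix are assumptions chosen for illustration, and the extra "dustbin" row and column for unmatched points follow the general SuperGlue formulation rather than anything stated on this page.

```python
import numpy as np


def logsumexp(x, axis):
    """Numerically stable log(sum(exp(x))) along one axis."""
    m = np.max(x, axis=axis, keepdims=True)
    return np.squeeze(m, axis=axis) + np.log(np.sum(np.exp(x - m), axis=axis))


def sinkhorn_log(scores, log_mu, log_nu, iters=100):
    """Log-domain Sinkhorn iterations.

    Finds offsets u, v such that exp(scores + u[:, None] + v[None, :])
    has row sums exp(log_mu) and column sums exp(log_nu).
    """
    u = np.zeros_like(log_mu)
    v = np.zeros_like(log_nu)
    for _ in range(iters):
        u = log_mu - logsumexp(scores + v[None, :], axis=1)  # fix row sums
        v = log_nu - logsumexp(scores + u[:, None], axis=0)  # fix column sums
    return scores + u[:, None] + v[None, :]


def match_keypoints(scores, dustbin_score=0.5, min_prob=0.2, iters=100):
    """Turn an M x N score matrix into mutual matches (illustrative only)."""
    scores = np.asarray(scores, dtype=float)
    m, n = scores.shape
    # Augment with one dustbin row and column that absorb unmatched points.
    aug = np.full((m + 1, n + 1), dustbin_score, dtype=float)
    aug[:m, :n] = scores
    # Each keypoint carries mass 1; each dustbin can absorb the other side.
    log_mu = np.log(np.concatenate([np.ones(m), [float(n)]]))
    log_nu = np.log(np.concatenate([np.ones(n), [float(m)]]))
    p = np.exp(sinkhorn_log(aug, log_mu, log_nu, iters))
    inner = p[:m, :n]
    best_col = inner.argmax(axis=1)  # best column for every row
    best_row = inner.argmax(axis=0)  # best row for every column
    # Keep pairs that choose each other and exceed the (assumed) threshold.
    matches = [(i, int(j)) for i, j in enumerate(best_col)
               if best_row[j] == i and inner[i, j] > min_prob]
    return matches, p


# Toy 3 x 4 score matrix (made-up values): column 2 has no good partner.
scores = np.array([[2.0, 0.1, 0.0, 0.2],
                   [0.0, 1.8, 0.1, 0.1],
                   [0.1, 0.0, 0.2, 1.9]])
matches, p = match_keypoints(scores)
print(matches)
```

The log-domain form is used here only to avoid overflow when scores are large. In the pipeline described in the abstract, the resulting matches would then feed camera pose recovery.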