Computer Science (计算机科学), 2023, Vol. 50, Issue (1): 87-97. doi: 10.11896/jsjkx.211000118

• Computer Graphics & Multimedia •

Viewpoint-tolerant Scene Recognition Based on Segmentation of Sparse Point Cloud

HE Xionghui1, TAN Jiefu1, LIU Zhe1, XUE Chao3, YANG Shaowu1, ZHANG Yongjun2

  1 College of Computer, National University of Defense Technology, Changsha 410073, China
  2 Artificial Intelligence Research Center, Defense Innovation Institute, Beijing 100166, China
  3 Tianjin (Binhai) Artificial Intelligence Innovation Center, Tianjin 300457, China
  • Received: 2021-10-15 Revised: 2022-04-13 Online: 2023-01-15 Published: 2023-01-09
  • Corresponding author: ZHANG Yongjun (yjzhang@nudt.edu.cn)
  • About author: HE Xionghui (hxh@nudt.edu.cn), born in 1996, postgraduate. His main research interests include place recognition in simultaneous localization and mapping.
    ZHANG Yongjun, born in 1966, Ph.D, professor. His main research interests include artificial intelligence, multi-agent cooperation, machine learning and feature recognition.
  • Supported by:
    National Natural Science Foundation of China (91948303) and Science and Technology Commission of Tianjin Binhai New Area (BHXQKJXM-PT-RGZNJMZX-2019001).

Abstract: In autonomous robot navigation, simultaneous localization and mapping (SLAM) perceives the surrounding environment and estimates the robot's own pose, providing perceptual support for subsequent high-level tasks. Scene recognition is a key module of SLAM: by identifying whether the current observation and a previous observation belong to the same scene, it corrects the error that accumulates from the inherent errors of the sensor hardware. Existing methods mainly address scene recognition under a stable viewpoint and judge whether two observations belong to the same scene by the visual similarity between them. When the viewpoint changes, however, observations of the same scene may differ greatly in appearance and be only partially similar, which causes traditional methods to fail. This paper therefore proposes a scene recognition method based on sparse point cloud segmentation. It segments the scene to handle partial similarity and combines visual and geometric information to describe and match scenes accurately, so that the robot can recognize the same scene from different viewpoints, supporting loop closure detection for a single robot or map fusion for multiple robots. The method divides each observation into several segments based on sparse point cloud segmentation, and the segmentation result is invariant to viewpoint. From each segment it extracts a local bag-of-words vector and a β-angle histogram to describe the scene content: the former carries the visual semantic information of the scene, and the latter carries its geometric structure. Matching is then performed at the segment level: the parts shared by two observations are matched and the differing parts are discarded, which yields accurate scene content matching and raises the success rate of scene recognition. Results on public datasets show that the method outperforms the bag-of-words method, which has received considerable attention in scene recognition, under both stable and changing viewpoints.

Key words: Visual scene recognition, Segmentation, Sparse point cloud, Simultaneous localization and mapping
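
The segment-level matching described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how the β-angle histogram is built or how the two descriptors are combined, so the sketch treats the histogram as a generic list of bin counts and makes several labeled assumptions. The names `Segment`, `bow_score`, `hist_score`, `segment_score`, `match_observations`, and the parameters `w_visual` and `min_score`, are hypothetical. The L1 bag-of-words score and the histogram intersection are stand-in similarity measures, and the greedy one-to-one assignment is one simple way to keep shared segments and drop the rest.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    bow: dict   # sparse local bag-of-words vector: visual word id -> weight
    hist: list  # bin counts; generic stand-in for the beta-angle histogram


def bow_score(a, b):
    """L1 similarity in [0, 1] between two sparse vectors (assumed measure)."""
    na, nb = sum(a.values()), sum(b.values())
    if na == 0 or nb == 0:
        return 0.0
    words = set(a) | set(b)
    dist = sum(abs(a.get(w, 0.0) / na - b.get(w, 0.0) / nb) for w in words)
    return 1.0 - 0.5 * dist


def hist_score(p, q):
    """Histogram intersection in [0, 1] after normalization (assumed measure)."""
    sp, sq = sum(p), sum(q)
    if sp == 0 or sq == 0:
        return 0.0
    return sum(min(x / sp, y / sq) for x, y in zip(p, q))


def segment_score(s, t, w_visual=0.5):
    """Blend visual and geometric similarity; the weight is hypothetical."""
    visual = bow_score(s.bow, t.bow)
    geometric = hist_score(s.hist, t.hist)
    return w_visual * visual + (1.0 - w_visual) * geometric


def match_observations(obs_a, obs_b, min_score=0.6):
    """Greedy one-to-one segment matching; unmatched segments are discarded.

    Returns (overall_score, matches), where matches holds (i, j, score)
    tuples and overall_score is the mean score of the matched pairs.
    """
    candidates = sorted(
        ((segment_score(s, t), i, j)
         for i, s in enumerate(obs_a)
         for j, t in enumerate(obs_b)),
        reverse=True,
    )
    used_a, used_b, matches = set(), set(), []
    for score, i, j in candidates:
        if score < min_score:
            break  # remaining candidates are weaker still
        if i in used_a or j in used_b:
            continue
        used_a.add(i)
        used_b.add(j)
        matches.append((i, j, score))
    overall = sum(m[2] for m in matches) / len(matches) if matches else 0.0
    return overall, matches


# Two observations of one scene from different viewpoints: only two of the
# three segments overlap, so whole-image similarity would be diluted.
obs_a = [
    Segment({1: 2.0, 2: 1.0}, [4, 1, 0, 0]),
    Segment({3: 1.0, 4: 3.0}, [0, 3, 2, 0]),
    Segment({5: 2.0, 6: 2.0}, [0, 0, 1, 4]),
]
obs_b = [
    Segment({3: 1.0, 4: 2.0}, [0, 3, 3, 0]),
    Segment({5: 2.0, 6: 1.0}, [0, 0, 1, 3]),
    Segment({7: 1.0, 8: 1.0}, [1, 1, 1, 1]),
]
overall, matches = match_observations(obs_a, obs_b)
print(overall, [(i, j) for i, j, _ in matches])
```

Because the overall score is averaged over matched segments only, the segment that is visible in just one observation does not lower it, which is the property the abstract attributes to segment-based matching under viewpoint change.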

CLC Number: TP391