Computer Science ›› 2025, Vol. 52 ›› Issue (8): 195-203.doi: 10.11896/jsjkx.240900086

• Computer Graphics & Multimedia •

IBSNet:A Neural Implicit Field for IBS Prediction in Single-view Scanned Point Cloud

YUAN Youwen, JIN Shuo, ZHAO Xi   

  1. College of Computer Science and Technology,Xi'an Jiaotong University,Xi'an 710049,China
  • Received:2024-09-13 Revised:2025-01-24 Online:2025-08-15 Published:2025-08-08
  • About author:YUAN Youwen,born in 2001,postgraduate.His main research interests include 3D point cloud processing and analysis,and 3D interaction relationship analysis.
    ZHAO Xi,born in 1985,Ph.D,professor,Ph.D supervisor,is a member of CCF(No.86701M).Her main research interests include 3D data analysis,processing and synthesis.
  • Supported by:
    National Key Research and Development Program of China(2022YFB3303202) and National Natural Science Foundation of China(62072366,U23A20312).

Abstract: The analysis of spatial relationships between 3D objects is of great significance for scene understanding and interaction.For example,analyzing the spatial relationship between a robot and an object can guide the robot to grasp the object more accurately,and learning the spatial relationships between objects in real scenes can guide the generation of virtual scenes that look more natural or better meet interaction requirements.However,single-view scanned point clouds obtained by RGB-D cameras or LiDAR usually contain many artifacts and much noise,so existing methods for analyzing the spatial relationships of objects often fail to make accurate predictions on such inputs,which limits their practical applicability.To handle spatial relationship analysis on single-view scanned point clouds,this paper uses the interaction bisector surface(IBS) to express spatial relationships,and proposes a dual-object differential unsigned distance field to represent the IBS implicitly.Inspired by the implicit function learning methods widely used in recent years,this paper designs a neural implicit field to fit this differential unsigned distance field.The neural implicit field takes the single-view scanned point clouds of two objects as input and returns their differential unsigned distance field.The network uses two multi-layer self-attention point cloud encoders to extract the features of the two input point clouds and then combines these features.The combined features are fed into a dual-object unsigned distance decoder to obtain the unsigned distance values of the query points.Comparative experiments against other methods(Geometry Method,IMNet and Grasping Field) are conducted on the ICON dataset.Single-view scans of each scene are simulated from 26 different viewpoints to obtain the single-view scanned point clouds,and the whole dataset is split into training and test sets on a per-scene basis.The robustness of each method is also tested on single-view scanned point clouds with different degrees of incompleteness and noise.Experimental results show that the proposed neural implicit field is highly robust to input single-view scanned point clouds with different degrees of incompleteness,and can efficiently predict IBS with accurate shapes.
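The dual-object design described in the abstract can be summarized as follows: two self-attention point cloud encoders produce per-object global features, the features are concatenated, and a decoder maps each query point to two unsigned distances whose difference defines the differential field f(x)=d_A(x)-d_B(x), with the IBS given by the zero level set f(x)=0. The listing below is a minimal PyTorch sketch of such a pipeline; the module names (IBSNet, PointEncoder), layer sizes, and pooling choices are illustrative assumptions, not the authors' released implementation.

    # Hypothetical sketch of a dual-object unsigned distance network
    # following the description in the abstract; all hyperparameters
    # and names are assumptions for illustration only.
    import torch
    import torch.nn as nn

    class PointEncoder(nn.Module):
        """Multi-layer self-attention encoder for one single-view point cloud."""
        def __init__(self, dim=256, layers=4, heads=4):
            super().__init__()
            self.embed = nn.Linear(3, dim)                   # per-point embedding
            enc_layer = nn.TransformerEncoderLayer(dim, heads, batch_first=True)
            self.attn = nn.TransformerEncoder(enc_layer, layers)
        def forward(self, pts):                              # pts: (B, N, 3)
            feat = self.attn(self.embed(pts))                # (B, N, dim)
            return feat.max(dim=1).values                    # global feature (B, dim)

    class IBSNet(nn.Module):
        """Predicts unsigned distances from query points to two scanned objects."""
        def __init__(self, dim=256):
            super().__init__()
            self.enc_a = PointEncoder(dim)
            self.enc_b = PointEncoder(dim)
            self.decoder = nn.Sequential(                    # dual-object UDF decoder
                nn.Linear(2 * dim + 3, 512), nn.ReLU(),
                nn.Linear(512, 512), nn.ReLU(),
                nn.Linear(512, 2), nn.Softplus())            # (d_A, d_B) >= 0
        def forward(self, pts_a, pts_b, queries):            # queries: (B, Q, 3)
            feat = torch.cat([self.enc_a(pts_a), self.enc_b(pts_b)], dim=-1)  # (B, 2*dim)
            feat = feat.unsqueeze(1).expand(-1, queries.size(1), -1)          # (B, Q, 2*dim)
            return self.decoder(torch.cat([feat, queries], dim=-1))           # (B, Q, 2)

    # The differential unsigned distance field is f(x) = d_A(x) - d_B(x); the IBS
    # is its zero level set and could, for example, be extracted on a dense query
    # grid with marching cubes:
    #   d = model(pts_a, pts_b, grid_queries); ibs_field = d[..., 0] - d[..., 1]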

Key words: Spatial relationship analysis, Interaction bisector surface, Single-view scanned point cloud, Neural implicit field, Unsigned distance field

CLC Number: TP391
[1]HUANG Z Y,XU J Z,DAI S S,et al.NIFT:Neural interaction field and template for object manipulation[C]//2023 IEEE International Conference on Robotics and Automation(ICRA).IEEE,2023:1875-1881.
[2]SHE Q J,HU R Z,XU J Z,et al.Learning high-DOF reaching-and-grasping via dynamic representation of gripper-object interaction[J].ACM Transactions on Graphics,2022,41(4):1-14.
[3]ZHAO X,WANG H,KOMURA T,et al.Indexing 3D Scenes Using the Interaction Bisector Surface[J].ACM Transactions on Graphics,2014,33(3):1-14.
[4]PARK J J,FLORENCE P,STRAUB J,et al.DeepSDF:Learning continuous signed distance functions for shape representation[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition.2019:165-174.
[5]CHEN Z Q,ZHANG H.Learning implicit fields for generative shape modeling[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition.2019:5939-5948.
[6]SITZMANN V,ZOLLHÖFER M,WETZSTEIN G,et al.Scene representation networks:Continuous 3D-structure-aware neural scene representations[J].arXiv:1906.01618,2019.
[7]WANG P,LIU L J,LIU Y,et al.NeuS:Learning Neural Implicit Surfaces by Volume Rendering for Multi-view Reconstruction[C]//NeurIPS 2021.2021:27171-27183.
[8]FU Q C,XU Q S,ONG Y S,et al.Geo-Neus:Geometry-consistent neural implicit surfaces learning for multi-view reconstruction[J].Advances in Neural Information Processing Systems,2022,35:3403-3416.
[9]CHIBANE J,MIR A,PONS-MOLL G.Neural unsigned distance fields for implicit function learning[C]//Proceedings of the 34th International Conference on Neural Information Processing Systems.2020:21638-21652.
[10]HU R,ZHU C,VAN KAICK O,et al.Interaction context(ICON):towards a geometric functionality descriptor[J].ACM Transactions on Graphics,2015,34(4):1-12.
[11]WU Z,SONG S,KHOSLA A,et al.3D ShapeNets:A Deep Representation for Volumetric Shapes[C]//2015 IEEE Conference on Computer Vision and Pattern Recognition(CVPR).2015:1912-1920.
[12]MATURANA D,SCHERER S.VoxNet:A 3D Convolutional Neural Network for real-time object recognition[C]//2015 IEEE/RSJ International Conference on Intelligent Robots and Systems(IROS).2015:922-928.
[13]RIEGLER G,ULUSOY A O,GEIGER A.OctNet:Learning Deep 3D Representations at High Resolutions[C]//2017 IEEE Conference on Computer Vision and Pattern Recognition(CVPR).2017:3577-3586.
[14]CHARLES R Q,SU H,KAICHUN M,et al.PointNet:Deep Learning on Point Sets for 3D Classification and Segmentation[C]//2017 IEEE Conference on Computer Vision and Pattern Recognition(CVPR).2017:652-660.
[15]ZHAO H,JIANG L,JIA J,et al.Point Transformer[C]//2021 IEEE/CVF International Conference on Computer Vision(ICCV).2021:16259-16268.
[16]GUO M H,CAI J X,LIU Z N,et al.PCT:Point cloud transformer[J].Computational Visual Media,2021,7(2):187-199.
[17]LIU M Y,YANG Q M,HU G H,et al.3D point cloud object detection algorithm based on Transformer[J].Journal of Northwestern Polytechnical University,2023,41(6):1190-1197.
[18]LIU X H,BAI Z Y,XU Z,et al.Multi-guided Point Cloud Registration Network Combined with Attention Mechanism[J].Computer Science,2024,51(2):142-150.
[19]KARUNRATANAKUL K,YANG J,ZHANG Y,et al.Grasping Field:Learning Implicit Representations for Human Grasps[C]//2020 International Conference on 3D Vision(3DV).2020:333-344.
[20]ZHAO X,ZHANG B,WU J,et al.Relationship-Based Point Cloud Completion[J].IEEE Transactions on Visualization and Computer Graphics,2022,28(12):4940-4950.
[21]HUANG Z Y,DAI S S,XU K,et al.DINA:Deformable INteraction Analogy[J].Graphical Models,2024,133:101217.
[22]XUAN H B,LI X Z,ZHANG J S,et al.Narrator:Towards Natural Control of Human-Scene Interaction Generation via Relationship Reasoning[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision.2023:22268-22278.
[23]ZHAO X,HU R,GUERRERO P,et al.Relationship templates for creating scene variations[J].ACM Transactions on Graphics,2016,35(6):1-13.
[24]HUANG Z,XU J,DAI S,et al.NIFT:Neural Interaction Field and Template for Object Manipulation[C]//2023 IEEE International Conference on Robotics and Automation(ICRA).2023:1875-1881.
[25]WALD J,DHAMO H,NAVAB N,et al.Learning 3D semantic scene graphs from 3D indoor reconstructions[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition.2020:3961-3970.
[26]LIU Y Y,LONG C J,ZHANG Z X,et al.Explore Contextual Information for 3D Scene Graph Generation[J].IEEE Transactions on Visualization and Computer Graphics,2023,29(12):5556-5568.
[27]CHABRA R,LENSSEN J E,ILG E,et al.Deep Local Shapes:Learning Local SDF Priors for Detailed 3D Reconstruction[C]//Computer Vision-ECCV 2020,Lecture Notes in Computer Science.2020:608-625.
[28]CHEN Z,ZHANG H.Learning Implicit Fields for Generative Shape Modeling[C]//2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition(CVPR).2019:5939-5948.
[29]YUAN W,KHOT T,HELD D,et al.PCN:Point Completion Network[C]//2018 International Conference on 3D Vision(3DV).2018:728-744.