Computer Science ›› 2022, Vol. 49 ›› Issue (10): 207-213. doi: 10.11896/jsjkx.210900066

• Computer Graphics & Multimedia •

Voxel Deformation Network Based on Environmental Information Mining

LIU Na-li (刘娜丽)1,3, TIAN Yan (田彦)1,3, SONG Ya-dong (宋亚东)4, JIANG Teng-fei (江腾飞)3, WANG Xun (王勋)1,2, YANG Bai-lin (杨柏林)1

    1 School of Computer Science & Information Engineering, Zhejiang Gongshang University, Hangzhou 310018, China
    2 Zhejiang Lab, Hangzhou 311121, China
    3 Shining 3D Research, Shining 3D Tech Co., Ltd., Hangzhou 310013, China
    4 School of Information Science and Engineering, East China University of Science and Technology, Shanghai 200237, China
  • Received: 2021-09-08 Revised: 2022-01-16 Online: 2022-10-15 Published: 2022-10-13
  • Corresponding author: TIAN Yan (tianyan@zjgsu.edu.cn)
  • About the authors: (2473447208@qq.com)
    LIU Na-li, born in 1996, postgraduate, is a member of China Computer Federation. Her main research interests include computer vision and deep learning.
    TIAN Yan, born in 1982, Ph.D., associate professor, master's supervisor, is a member of China Computer Federation. His main research interests include machine learning and video analysis.
  • Supported by:
    National Key R&D Program of China (2018YFB1404102, 2018YFB1403200), National Natural Science Foundation of China (61972351, 61976188, 61972353), Natural Science Foundation of Zhejiang Province (LY19F030005), Opening Foundation of State Key Laboratory of Virtual Reality Technology and System of Beihang University, China (VRLAB2020B15) and Zhejiang Laboratory Funded Project (2019KD0AC02).

Abstract: 3D deformation is an active research topic in computer graphics. Current 3D deformation methods mainly learn the changes before and after deformation by aggregating the features of locally adjacent voxels, and they do not fully exploit the relationships between non-local voxel features. The resulting lack of environmental (contextual) information prevents the model from capturing more discriminative features. To address this problem, this paper designs a voxel deformation network based on environmental information mining. The network extracts local and environmental information simultaneously and mines environmental information from different spatial domains to improve its representation ability, and thereby models the relationship between an object before and after deformation. First, a self-attention mechanism is introduced that learns the non-local dependencies among voxels in the feature space, making voxel features more discriminative. Second, a multi-scale analysis method is introduced that uses dilated convolutions with different dilation rates to extract environmental information from different receptive fields, providing richer contextual features for the model. In addition, the paper analyzes the impact of feature fusion on the model and designs an encoder-decoder feature fusion method that adaptively fuses the features extracted by the encoder and the decoder, improving the nonlinear mapping capability of the model. Extensive comparative experiments on a self-built dental dataset show that the proposed method improves deformation prediction accuracy to some extent compared with existing methods.

Key words: Shape deformation, Voxel, Attention mechanism, Feature fusion, Multi-scale analysis
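
The two context-mining ideas named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes identity projections in place of learned ones, works on a flat list of voxel feature vectors, and uses a 1D signal instead of a 3D volume to stay short. The function names `non_local_attention`, `dilated_conv1d`, and `multi_scale_context` are hypothetical, and summing the branches is only one possible way to combine scales.

```python
import math


def non_local_attention(features):
    """Self-attention over voxel features (identity projections, an assumption).

    features: list of N feature vectors, each a list of C floats.
    Returns N vectors: each input plus a softmax-weighted sum of all inputs.
    """
    n = len(features)
    dim = len(features[0])
    out = []
    for i in range(n):
        # Scaled dot-product similarity between voxel i and every voxel j.
        scores = [
            sum(a * b for a, b in zip(features[i], features[j])) / math.sqrt(dim)
            for j in range(n)
        ]
        peak = max(scores)  # subtract the max for a numerically stable softmax
        weights = [math.exp(s - peak) for s in scores]
        total = sum(weights)
        weights = [w / total for w in weights]
        # Context vector: attention-weighted average over all voxels.
        context = [
            sum(weights[j] * features[j][c] for j in range(n)) for c in range(dim)
        ]
        # Residual connection keeps the original local feature.
        out.append([x + y for x, y in zip(features[i], context)])
    return out


def dilated_conv1d(signal, kernel, dilation):
    """Same-size 1D dilated convolution with zero padding (odd-length kernel)."""
    half = (len(kernel) // 2) * dilation
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = i - half + k * dilation  # taps are spaced `dilation` apart
            if 0 <= j < len(signal):
                acc += w * signal[j]
        out.append(acc)
    return out


def multi_scale_context(signal, kernel, dilations=(1, 2, 4)):
    """Apply one kernel at several dilation rates and sum the branch outputs."""
    branches = [dilated_conv1d(signal, kernel, d) for d in dilations]
    return [sum(values) for values in zip(*branches)]
```

With a 3-tap kernel, dilation rates 1, 2, and 4 cover spans of 3, 5, and 9 samples with the same number of weights, which is why dilated branches are a cheap way to widen the receptive field.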

CLC Number:

  • TP391