Point Cloud Upsampling Network Incorporating Transformer and Multi-stage Learning Framework

doi:10.11896/jsjkx.230300154

Abstract

Abstract: Drawing on the Transformer's powerful feature encoding capabilities in natural language processing and computer vision, and inspired by multi-stage learning frameworks, this paper designs a point cloud upsampling network, MSPUiT, that combines a Transformer with a multi-stage learning framework. The network adopts a two-stage model. The first stage is a dense point generation network: a multi-layer Transformer encoder progressively transforms the local geometric information and local feature information of the input point cloud into high-level semantic features, a feature expansion module upsamples the point cloud features in feature space, and a coordinate regression module maps the features back to Euclidean space to generate an initial dense point cloud. The second stage is a point-by-point optimisation network: a Transformer encoder encodes the latent semantic features of the dense point cloud and combines them with the semantic features from the first stage to obtain the complete semantic features of the point cloud, an information integration module extracts per-point error features from the geometric information and semantic features of the dense point cloud, and an error regression module computes each point's coordinate offset in Euclidean space from those error features. This point-by-point optimisation makes the distribution of points more uniform and brings them closer to the real object surface. In extensive experiments on the large synthetic dataset PU1K, the high-resolution point clouds generated by MSPUiT reduce the Chamfer Distance (CD), the Hausdorff Distance (HD), and the distance from the generated point cloud to the original point cloud patch (P2F) to 0.501×10^-3, 5.958×10^-3, and 1.756×10^-3, respectively. Experimental results show that point clouds upsampled by MSPUiT have smoother surfaces and less noise, and that the quality of the generated point clouds is higher than that of current mainstream point cloud upsampling networks.
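
For readers unfamiliar with the two point-set metrics named above, the following is a minimal sketch of how Chamfer Distance and Hausdorff Distance can be computed by brute force. It is an illustration under stated assumptions, not the paper's evaluation code: the abstract does not say which variant is used (squared or unsquared distances, summed or averaged directions), so this sketch assumes unsquared Euclidean distances, a per-direction mean for CD, and the symmetric maximum for HD. The function names and the toy point sets are hypothetical.

```python
import math


def nearest_distances(source, target):
    """For each point in source, the Euclidean distance to its nearest point in target."""
    return [min(math.dist(p, q) for q in target) for p in source]


def chamfer_distance(a, b):
    """Symmetric Chamfer Distance.

    Assumption: unsquared distances, averaged per direction, then summed.
    Other variants (squared distances, halved sums) are also common.
    """
    d_ab = nearest_distances(a, b)
    d_ba = nearest_distances(b, a)
    return sum(d_ab) / len(d_ab) + sum(d_ba) / len(d_ba)


def hausdorff_distance(a, b):
    """Symmetric Hausdorff Distance: the worst nearest-neighbour distance in either direction."""
    return max(max(nearest_distances(a, b)), max(nearest_distances(b, a)))


# Toy point sets (hypothetical): two of the three points coincide.
generated = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 0.0, 1.0)]

cd = chamfer_distance(generated, reference)
hd = hausdorff_distance(generated, reference)
print(f"CD = {cd:.4f}, HD = {hd:.4f}")
```

CD averages the mismatch over all points, so it reflects overall fidelity, while HD reports only the single worst point and is therefore sensitive to outliers. The brute-force search is quadratic in the number of points; a spatial index would be needed for point clouds of realistic size.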

Key words: Transformer encoder, Multi-stage learning framework, Feature conversion, Point cloud upsampling, Deep learning

CLC Number: TP391

LI Zekai, BAI Zhengyao, XIAO Xiao, ZHANG Yihan, YOU Yilin. Point Cloud Upsampling Network Incorporating Transformer and Multi-stage Learning Framework[J].Computer Science, 2024, 51(6): 231-238.
