Low-bitrate and Real-time Multiview Video Streaming with 3D Gaussian Splatting

doi:10.11896/jsjkx.250700104

Abstract

Multiview videos offer viewers immersive experiences and enable a variety of applications, but they require several times the transmission bandwidth of traditional videos. Current multiview coding algorithms mainly exploit redundancy between 2D views and do not consider redundancy in 3D space. This paper presents a multiview video streaming approach that transforms multiview video content into a compact sparse-view representation to reduce 3D spatial redundancy. At the receiver side, the remaining views are synthesized through 3D reconstruction based on this representation. Specifically, the paper proposes a compact multiview video representation based on sparse views, in which the remaining views are synthesized using 3D Gaussian reconstruction and splatting, together with a view selection method that selects views so as to optimize the visual quality of the synthesized views. Experiments show that the proposed method achieves at least a 44.6% bitrate reduction compared with the baseline and supports end-to-end streaming at over 30 FPS.

Key words: Video streaming, Multiview video, 3D Gaussian splatting, Immersive video, 3D reconstruction

CLC Number: TP391

WANG Yizong, NING Hongbo, WANG Haofeng, MA Siwei, GAO Wen. Low-bitrate and Real-time Multiview Video Streaming with 3D Gaussian Splatting [J]. Computer Science, 2026, 53(3): 225-230.
