Low-bitrate and Real-time Multiview Video Streaming with 3D Gaussian Splatting

doi:10.11896/jsjkx.250700104

Computer Science, 2026, Vol. 53, Issue (3): 225-230. doi: 10.11896/jsjkx.250700104

• Computer Graphics & Multimedia •


WANG Yizong¹, NING Hongbo², WANG Haofeng², MA Siwei¹, GAO Wen¹

1 School of Computer Science, Peking University, Beijing 100871, China
2 School of Electronic and Computer Engineering, Peking University, Shenzhen, Guangdong 518055, China

Received: 2025-07-16 Revised: 2025-10-18 Online: 2026-03-12
Corresponding author: MA Siwei (swma@pku.edu.cn)
About the authors (wang@pku.edu.cn):
WANG Yizong, born in 1997, Ph.D., is a member of CCF (No. B7846M). His main research interest is immersive video streaming.
MA Siwei, born in 1979, professor, Ph.D. supervisor. His main research interest is video coding.
Supported by:
Beijing Natural Science Foundation (L242014), PCL-CMCC Foundation for Science and Innovation (2024ZY1C0040), and the Postdoctoral Fellowship Program and China Postdoctoral Science Foundation (BX20250382).

Abstract

Abstract: Multiview videos offer viewers immersive experiences and enable a variety of applications, but they require far more transmission bandwidth than traditional videos. Current multiview coding algorithms mainly exploit redundancy between 2D views and do not consider redundancy in 3D space. This paper presents a multiview video streaming approach that transforms multiview video content into a compact sparse-view representation to reduce redundancy in 3D space. At the receiver side, the remaining views are synthesized through 3D reconstruction based on this representation. Specifically, this paper proposes (1) a compact multiview video representation based on sparse views, in which the remaining views are synthesized using 3D Gaussian reconstruction and splatting, and (2) a view selection method that selects views to optimize the visual quality of the synthesized views. Experiments show that the proposed method achieves a bitrate reduction of at least 44.6% compared with the baseline and supports real-time end-to-end streaming at over 30 FPS.

Key words: Video streaming, Multiview video, 3D Gaussian splatting, Immersive video, 3D reconstruction
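
As a rough illustration of what the two headline figures in the abstract imply, the following sketch converts them into a bitrate ceiling and a per-frame time budget. It assumes the 44.6% figure can be read as a plain ratio against the baseline bitrate; the abstract does not state the quality level or metric at which the saving was measured. The function names and the 100 Mbps baseline are hypothetical and chosen only for illustration.

```python
# Figures quoted in the abstract.
MIN_BITRATE_REDUCTION = 0.446  # "at least 44.6%" bitrate reduction
MIN_FPS = 30                   # end-to-end streaming "at over 30 FPS"


def max_bitrate_after_reduction(baseline_mbps: float) -> float:
    """Upper bound on the bitrate once the minimum reported saving is applied."""
    return baseline_mbps * (1.0 - MIN_BITRATE_REDUCTION)


def per_frame_budget_ms(fps: float = MIN_FPS) -> float:
    """Time available per frame for the whole pipeline at the given frame rate."""
    return 1000.0 / fps


if __name__ == "__main__":
    baseline = 100.0  # hypothetical baseline bitrate in Mbps (not from the paper)
    print(f"bitrate at most {max_bitrate_after_reduction(baseline):.1f} Mbps")
    print(f"frame budget under {per_frame_budget_ms():.1f} ms")
```

Under these assumptions, a 100 Mbps baseline stream would need at most about 55.4 Mbps, and sustaining more than 30 FPS end to end leaves under roughly 33.3 ms per frame for the entire pipeline.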

CLC Number:

TP391

WANG Yizong, NING Hongbo, WANG Haofeng, MA Siwei, GAO Wen. Low-bitrate and Real-time Multiview Video Streaming with 3D Gaussian Splatting[J]. Computer Science, 2026, 53(3): 225-230. https://doi.org/10.11896/jsjkx.250700104

