Computer Science ›› 2026, Vol. 53 ›› Issue (3): 225-230. doi: 10.11896/jsjkx.250700104
王义总1, 宁泓博2, 王昊峰2, 马思伟1, 高文1
WANG Yizong1, NING Hongbo2, WANG Haofeng2, MA Siwei1, GAO Wen1
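Abstract: Multi-view video offers users an immersive experience and supports a wide range of applications, but its transmission bandwidth requirement is far higher than that of conventional video. Existing multi-view coding algorithms mainly exploit redundancy between 2D views and do not account for redundancy in 3D space. To address this, a multi-view video streaming method is proposed that converts multi-view video into a compact sparse-view representation to reduce 3D spatial redundancy, then performs 3D reconstruction at the receiver from that representation to synthesize the remaining views. Specifically: 1) a compact representation of multi-view video based on sparse views is proposed, in which the remaining views are synthesized by 3D Gaussian reconstruction and splatting; 2) a view selection method is designed to optimize the visual quality of the synthesized views. Experiments show that the proposed system reduces bitrate by at least 44.6% compared with baseline methods while supporting real-time end-to-end transmission at over 30 FPS.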
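The abstract does not describe how the sparse views are chosen, so the following is only an illustrative sketch of the general idea of transmitting a subset of views and synthesizing the rest at the receiver. It uses greedy farthest-point sampling over camera positions, a generic coverage heuristic swapped in here as a stand-in; it is not the paper's view selection method, which optimizes the visual quality of the synthesized views. The function name `select_sparse_views`, the camera layout, and the choice of starting camera are all assumptions made for illustration.

```python
import math


def select_sparse_views(camera_positions, k):
    """Pick k view indices by greedy farthest-point sampling.

    Illustrative stand-in only (assumption): it spreads the transmitted
    views across the camera rig and does not model synthesis quality.
    """
    n = len(camera_positions)
    if not 0 < k <= n:
        raise ValueError("k must be between 1 and the number of views")

    selected = [0]  # arbitrary starting camera (assumption)
    # Distance from each camera to its nearest already-selected camera.
    nearest = [math.dist(p, camera_positions[0]) for p in camera_positions]

    while len(selected) < k:
        # Take the camera that is currently farthest from every selected one.
        nxt = max(range(n), key=lambda i: nearest[i])
        selected.append(nxt)
        for i, p in enumerate(camera_positions):
            nearest[i] = min(nearest[i], math.dist(p, camera_positions[nxt]))

    return sorted(selected)


# Hypothetical rig: 8 cameras evenly spaced along a line.
cameras = [(float(i), 0.0, 0.0) for i in range(8)]
transmitted = select_sparse_views(cameras, 3)
synthesized = [i for i in range(len(cameras)) if i not in transmitted]
```

In this toy setup, the views in `transmitted` would be encoded and streamed, while those in `synthesized` would be rendered at the receiver from the reconstructed 3D scene.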