Computer Science, 2022, Vol. 49, Issue (7): 127-131. doi: 10.11896/jsjkx.211100179

• Computer Graphics & Multimedia •

Virtual Reality Video Intraframe Prediction Coding Based on Convolutional Neural Network

LIU Yue-hong1, NIU Shao-hua2, SHEN Xian-hao1

  1. College of Information Science and Engineering, Guilin University of Technology, Guilin, Guangxi 541004, China
  2. School of Mechanical and Electrical Engineering, Beijing Institute of Technology, Beijing 100081, China
  • Received: 2021-11-17 Revised: 2022-03-15 Online: 2022-07-15 Published: 2022-07-12
  • Corresponding author: SHEN Xian-hao (lyj_sxh@sina.com)
  • Author e-mail: 9351747@qq.com
  • About author: LIU Yue-hong, born in 1980, master's degree. Her main research interests include fiber-optic communication, intelligent hardware, and virtual reality.
    SHEN Xian-hao, born in 1980, Ph.D., professor. His main research interests include deep learning and virtual testing.
  • Supported by:
    National Natural Science Foundation of China (61961010), Natural Science Foundation of Guangxi Province (2018GXNSFBA050029, 2020GXNSFAA297255), and Guangxi Science and Technology Major Special Project (Gui Ke AA19046004).

Abstract: To improve the performance of intraframe prediction coding for virtual reality video, a convolutional neural network (CNN) is used to select the coding units (CUs) of each video frame, thereby reducing the complexity of video image coding. First, the quantization parameters are set and virtual reality video frame samples are obtained. Next, the image coding tree is constructed, and a CNN-based optimization model for the frame coding units is established. The luminance of the frame samples is taken as the CNN input and, combined with a threshold on the image rate-distortion cost, the optimized coding-unit partition is obtained through training. With CNN training, coding tree unit (CTU) structures of different depths and an appropriate number of CUs can be obtained according to the intraframe coding requirements of image regions with different texture complexity. Experimental results show that, with a reasonable choice of convolution kernel size and quantization parameters, the CNN algorithm achieves better image quality and less coding time than common video intraframe prediction coding algorithms; on the Balboa sequence, its BD bit rate and coding time are 56 483.76 kbps and 3 209.24 s, respectively.

Key words: Coding unit, Convolution kernel size, Convolutional neural network, Intraframe coding, Virtual reality
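
The recursive CU selection described in the abstract can be illustrated with a minimal sketch. It is not the authors' model: the abstract does not specify the network or the rate-distortion threshold rule, so the hypothetical function `should_split` below stands in for the CNN decision by comparing the luma variance of a block with a threshold. The names `partition`, `demo_ctu`, `MIN_CU`, and `CTU_SIZE`, and the 64x64 to 8x8 size range, are assumptions made for illustration.

```python
from statistics import pvariance

CTU_SIZE = 64  # assumed CTU size (luma samples per side)
MIN_CU = 8     # assumed smallest CU size


def should_split(block, threshold):
    """Hypothetical stand-in for the CNN split decision.

    Returns True when the luma variance of the block exceeds the
    threshold, i.e. when the block is 'textured' enough to split.
    """
    pixels = [p for row in block for p in row]
    return pvariance(pixels) > threshold


def partition(luma, x, y, size, threshold, depth=0):
    """Recursively split a square luma block into a quadtree of CUs.

    Returns a list of leaf CUs as (x, y, size, depth) tuples.
    """
    block = [row[x:x + size] for row in luma[y:y + size]]
    if size > MIN_CU and should_split(block, threshold):
        half = size // 2
        leaves = []
        for dy in (0, half):
            for dx in (0, half):
                leaves += partition(luma, x + dx, y + dy, half,
                                    threshold, depth + 1)
        return leaves
    return [(x, y, size, depth)]


def demo_ctu():
    """Synthetic CTU: checkerboard texture in the top-left quadrant,
    flat gray everywhere else."""
    return [[(255 if (x + y) % 2 else 0) if (x < 32 and y < 32) else 128
             for x in range(CTU_SIZE)]
            for y in range(CTU_SIZE)]


if __name__ == "__main__":
    cus = partition(demo_ctu(), 0, 0, CTU_SIZE, threshold=100.0)
    for size in (32, 16, 8):
        print(size, sum(1 for cu in cus if cu[2] == size))
```

On the synthetic block, the textured quadrant is split down to the smallest CU size while the three flat quadrants stay as single 32x32 CUs. This mirrors the behaviour the abstract describes: deeper trees and more CUs where texture is complex, shallower trees elsewhere. In a real encoder the variance test would be replaced by the trained network's output and the rate-distortion cost check.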

CLC Number: 

  • TP317.4