Computer Science ›› 2019, Vol. 46 ›› Issue (3): 113-118. doi: 10.11896/j.issn.1002-137X.2019.03.016

• ChinaMM2018 •

Deep Learning Based Fast Video Transcoding Algorithm

XU Jing-yao, WANG Zu-lin, XU Mai   

  1. (College of Electronics and Information Engineering, Beihang University, Beijing 100191, China)
  • Received: 2018-07-11 Revised: 2018-09-15 Online: 2019-03-15 Published: 2019-03-22

Abstract: As the latest video compression standard, High Efficiency Video Coding (HEVC) offers good rate-distortion performance and has been adopted by a growing number of terminals. However, a large number of H.264 streams are still in use, which makes H.264 to HEVC video transcoding a meaningful research problem. The simplest way to transcode is to cascade an H.264 decoder directly with an HEVC encoder, but because HEVC encoding is highly complex, this approach is time-consuming. This paper therefore proposes a fast H.264 to HEVC transcoding method that uses deep learning to predict the coding tree unit (CTU) partition of HEVC, avoiding the brute-force search over CTU partitions in rate-distortion optimization (RDO). First, a large-scale H.264 to HEVC transcoding database is built to support training of the deep learning model. Second, the correlation between the HEVC CTU partition and H.264-domain features is analyzed, and the CTU partition is found to be similar across frames. Then, a three-level classifier based on long short-term memory (LSTM) is designed to predict the CTU partition. Experimental results show that the proposed fast transcoding algorithm reduces complexity by 60% compared with the original transcoder while lowering the peak signal-to-noise ratio by only 0.039 dB, outperforming state-of-the-art transcoding methods.
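
The abstract does not spell out how the three classifier levels map onto the CTU quadtree. A minimal sketch, assuming the standard HEVC configuration (64x64 CTU, 8x8 minimum CU) and one binary split decision per level (64 to 32, 32 to 16, 16 to 8), might look like the following. The names `predict_partition`, `SplitPredictor`, and `toy_predictor` are hypothetical, and the predictor is a plain callable standing in for the trained LSTM classifier, not the authors' model.

```python
from typing import Callable, List, Tuple

# One coding unit (CU) inside a CTU: (x, y, size) in pixels.
CU = Tuple[int, int, int]

# Hypothetical stand-in for the learned classifier: returns True when the
# CU at (x, y) with the given size should be split into four sub-CUs.
SplitPredictor = Callable[[int, int, int], bool]

CTU_SIZE = 64     # assumed CTU size (common HEVC default)
MIN_CU_SIZE = 8   # assumed minimum CU size (common HEVC default)


def predict_partition(should_split: SplitPredictor,
                      x: int = 0, y: int = 0,
                      size: int = CTU_SIZE) -> List[CU]:
    """Return the leaf CUs of the predicted quadtree for one CTU.

    The predictor is consulted only at sizes 64, 32 and 16, which gives
    three decision levels; 8x8 CUs are never split further.
    """
    if size == MIN_CU_SIZE or not should_split(x, y, size):
        return [(x, y, size)]
    half = size // 2
    leaves: List[CU] = []
    for dy in (0, half):
        for dx in (0, half):
            leaves.extend(predict_partition(should_split, x + dx, y + dy, half))
    return leaves


def toy_predictor(x: int, y: int, size: int) -> bool:
    """Toy rule for illustration: split the CTU and its top-left 32x32 CU."""
    return size == 64 or (size == 32 and x == 0 and y == 0)


leaves = predict_partition(toy_predictor)
print(len(leaves))  # 7 leaf CUs: four 16x16 and three 32x32
```

Under these assumptions the predictor is consulted at most 1 + 4 + 16 = 21 times per CTU, and the encoder evaluates only the predicted partition instead of running RDO over every candidate quadtree.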

Key words: Deep learning, H.264, HEVC, Video transcoding

CLC Number: 

  • TN919.81