Computer Science, 2021, Vol. 48, Issue (3): 1-8. doi: 10.11896/jsjkx.201100134

Special topic: Advances in Multimedia Technology

Advances in End-to-End Optimized Image Compression Technologies

LIU Dong, WANG Ye-fei, LIN Jian-ping, MA Hai-chuan, YANG Run-yu

  1. Department of Electronic Engineering and Information Science, University of Science and Technology of China, Hefei 230027, China
  • Received: 2020-11-19 Revised: 2020-12-02 Online: 2021-03-15 Published: 2021-03-05
  • Corresponding author: LIU Dong (dongeliu@ustc.edu.cn)
  • About author: LIU Dong, born in 1983, Ph.D., professor, is a senior member of the China Computer Federation. His main research interests include multimedia signal processing.
  • Supported by:
    National Natural Science Foundation of China (61772483).

Abstract: Image compression is the application of data compression to digital images. It aims to reduce the redundancy in image data so that the data can be stored and transmitted in a more efficient format. In traditional image compression methods, compression is divided into several steps, such as prediction, transform, quantization, and entropy coding, and each step is optimized separately with a manually designed algorithm. In recent years, end-to-end image compression methods based on deep neural networks have achieved fruitful results. Compared with traditional methods, an end-to-end method can be optimized jointly, which often yields higher compression efficiency. This paper first introduces end-to-end image compression methods and network structures. It then describes the key technologies of end-to-end image compression, including quantization, probability modeling and entropy coding, and encoder-side bit allocation. Next, it reviews research on extended applications of end-to-end image compression, including scalable coding, variable-bit-rate compression, and compression oriented toward visual perception and machine perception. Finally, experiments compare the compression efficiency currently achievable by end-to-end image compression with that of traditional methods. The experimental results show that the compression efficiency of the state-of-the-art end-to-end image compression method is much higher than that of traditional image coding methods including JPEG, JPEG2000, and HEVC intra. Compared with the newest coding standard, VVC intra, the end-to-end image compression method saves up to 48.40% of the coding rate while maintaining the same MS-SSIM.

Key words: Compression efficiency, Deep neural network, End-to-end optimization, HEVC, Image compression, JPEG, JPEG2000, VVC

CLC Number:

  • TN919.81