Computer Science ›› 2020, Vol. 47 ›› Issue (8): 261-266.doi: 10.11896/jsjkx.190700062

Previous Articles     Next Articles

Convolutional Neural Networks Compression Based on Pruning and Quantization

SUN Yan-li, YE Jiong-yao   

  1. College of Information Science and Engineering, East China University of Science and Technology, Shanghai 200237, China
  • Online:2020-08-15 Published:2020-08-10
  • About author:SUN Yan-li, born in 1992, postgraduate.Her main research interests include neural network compression and so on.
    YE Jiong-yao, born in 1978, professor, postgraduate’s supervisor.His main research interests include IC/SOC design and low-power research, video, radio and television chip development.

Abstract: With the development of deep learning, Convolutional Neural Networks(CNN), as one of its important algorithms, is widely applied in a variety of fields such as target detection, natural language processing, speech recognition and image identification, and achieves better results than the traditional algorithm.The number of parameters and calculation are increasing with the depth of network structure, resulting in many algorithms must be implemented on GPU.So it is difficult to apply the CNN model to mobile terminals which has limited resources and high real-time request.To solve this problem, this paper presents a method of optimizing network structure and parameters simultaneously.Firstly, the algorithms prunes weight according to its influence on the results of network, and ensure that the redundant information is removed while retaining the important connection of the mo-del.Then, this paper quantizes the float-point weight and activation of CNN.This changes float-point operation to fixed-point ope-ration.It not only reduces the computational complexity of the network model, but also reduces the size of the network model.To verify the algorithm, deep learning framework tensorflow is selected, and Spyder compiler is used in Ubuntu 16.04 operating system.The experimental results show that this method reduces size of LeNet model with simple structure from 1.64M to 0.36M.Compression ratio reaches 78%, but the accuracy drops only 0.016.And it also reduces the size of MobileNet with Lightweight network from 16.9M to 3.1M.Compression ratio reaches 81% with the accuracy dropping only 0.03.The data show that combining the weight pruning and parameter quantification of convolution neural network can effectively compress the convolution neural network within the acceptable range of accuracy loss.So, this method solve the difficulty of deploying the convolution neural network to the mobile terminal.

Key words: Convolutional neural networks, Weight pruning, Quantization-aware-training, Parameters quantization, Network compression

CLC Number: 

  • TP183
[1] KRIZHEVSKY A, SUTSKEVER I, HINTON G E.Imagenet classification with deep convolutional neural networks[C]∥Advances in Neural Information Processing Systems.2012:1097-1105.
[2] SIMONYAN K, ZISSERMAN A.Very deep convolutional networks for large-scale image recognition.arXiv:1409.1556, 2014.
[3] SZEGEDY C, LIU W, JIA Y, et al.Going deeper with convolutions[C]∥Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.2015:1-9.
[4] HE K, ZHANG X, REN S, et al.Deep residual learning for ima-ge recognition[C]∥Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.2016:770-778.
[5] REN S, HE K, GIRSHICK R, et al.Faster r-cnn:Towards real-time object detection with region proposal networks[C]∥Advances in Neural Information Processing Systems.2015:91-99.REDMON J, DIVVALA S, GIRSHICK R, et al.You only look once:Unified, real-time object detection[C]∥Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.2016:779-788.
[7] LONG J, SHELHAMER E, DARRELL T.Fully convolutional networks for semantic segmentation[C]∥Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.2015:3431-3440.
[8] HOWARD A G, ZHU M, CHEN B, et al.Mobilenets:Efficientconvolutional neural networks for mobile vision applications.arXiv:1704.04861, 2017.
[9] IANDOLA F N, HAN S, MOSKEWICZ M W, et al.Squee-zeNet:AlexNet-level accuracy with 50x fewer parameters and< 0.5MB model size.arXiv:1602.07360, 2016.
[10] ZHANG X, ZHOU X, LIN M, et al.Shufflenet:An extremely efficient convolutional neural network for mobile devices[C]∥Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.2018:6848-6856.
[11] HUANG G, LIU Z, VAN DER MAATEN L, et al.Denselyconnected convolutional networks[C]∥Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.2017:4700-4708.
[12] HAN S, POOL J, TRAN J, et al.Learning both weights andconnections for efficient neural network[C]∥Advances in Neural Information Processing Systems.2015:1135-1143.
[13] HAN S, MAO H, DALLY W J.Deep compression:Compressing deep neural networks with pruning, trained quantization and huffman coding.arXiv:1510.00149, 2015.
[14] LECUN Y, DENKER J S, SOLLA S A.Optimal brain damage[C]∥Advances in Neural Information Processing Systems.1990:598-605.
[15] LEBEDEV V, LEMPITSKY V.Fast convNets using group-wise brain damage∥Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.2016:2554-2564.COURBARIAUX M, BENGIO Y, DAVID J P.Binaryconnect:Training deep neural networks with binary weights during propa-gations[C]∥Advances in Neural Information Processing Systems.2015:3123-3131.RASTEGARI M, ORDONEZ V, REDMON J, et al.Xnor-net:Imagenet classification using binary convolutional neural networks[C]∥European Conference on Computer Vision.Cham:Springer, 2016:525-542.LI F, ZHANG B, LIU B.Ternary weight, 2016.ZHOU A, YAO A, GUO Y, et al.Incremental network quantization:Towards lossless cnns with low-precision weights .ar-Xiv:1702.03044, 2017.COURBARIAUX M, HUBARA I, SOUDRY D, et al.Binarized Neural Networks:Training Deep Neural Networks with Weights and Activations Constrained to +1 or -1.arXiv:1602.02830, 2016.WANG P, HU Q, ZHANG Y, et al.Two-step quantization for low-bit neural networks[C]∥Proceedings of the IEEEConfe-rence on Computer Vision and Pattern Recognition.2018:4376-4384.KRISHNAMOORTHI R.Quantizing deep convolutional net-works for efficient inference:A whitepaper.arXiv:1806.08342, 2018.CAI R C, ZHONG C R, YU Y, et al.CNN quantization and com-pression strategy for edge computing applications.Journal of Computer Applications, 2018, 38(9):2449-2454.JACOB B, KLIGYS S, CHEN B, et al.Quantization and training of neural networks for efficient integer-arithmetic-only inference[C]∥Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.2018:2704-2713.GYSEL P, MOTAMEDI M, GHIASI S.Hardware-oriented approximation of convolutional neural networks.arXiv:1604.03168, 2016.HAMMERSTROM D.A VLSI architecture for high-perfor-mance, low-cost, on-chip learning∥1990 IJCNN International Joint Conference on Neural Networks.IEEE, 1990:537-544.IOFFE S, SZEGEDY C.Batch normalization:Accelerating deep network training by reducing internal covariate shift.arXiv:1502.0316.2015.
[1] MA Hai-Jiang. Recommendation Algorithm Based on Convolutional Neural Network and Constrained Probability Matrix Factorization [J]. Computer Science, 2020, 47(6A): 540-545.
[2] ZHENG Zhe, HU Qing-hao, LIU Qing-shan, LENG Cong. Quantizing Weights and Activations in Generative Adversarial Networks [J]. Computer Science, 2020, 47(5): 144-148.
[3] PENG Xian, PENG Yu-xu, TANG Qiang, SONG Yan-qi. Crowd Counting Based on Single-column Multi-scale Convolutional Neural Network [J]. Computer Science, 2020, 47(4): 150-156.
[4] LIU Yu-hong,LIU Shu-ying,FU Fu-xiang. Optimization of Compressed Sensing Reconstruction Algorithms Based on Convolutional Neural Network [J]. Computer Science, 2020, 47(3): 143-148.
[5] HUANG Hong-wei,LIU Yu-jiao,SHEN Zhuo-kai,ZHANG Shao-wei,CHEN Zhi-min,GAO Yang. End-to-end Track Association Based on Deep Learning Network Model [J]. Computer Science, 2020, 47(3): 200-205.
[6] WANG Li-hua,DU Ming-hui,LIANG Ya-ling. Classification Net Based on Angular Feature [J]. Computer Science, 2020, 47(2): 83-87.
[7] FU Xue-yang,SUN Qi,HUANG Yue,DING Xing-hao. Single Image De-raining Method Based on Deep Adjacently Connected Networks [J]. Computer Science, 2020, 47(2): 106-111.
[8] SHAO Yang-xue, MENG Wei, KONG Deng-zhen, HAN Lin-xuan, LIU Yang. Cross-modal Retrieval Method for Special Vehicles Based on Deep Learning [J]. Computer Science, 2020, 47(12): 205-209.
[9] HAN Rui, GU Chun-li, LI Zhe, WU Kang, GAO Feng, SHEN Wen-hai. Global Typhoon Message Collection Method Based on CNN-typhoon Model [J]. Computer Science, 2020, 47(11A): 11-17.
[10] ZHANG Mei-yu, LIU Yue-hui, QIN Xu-jia, WU Liang-wu. Neural Style Transfer Method Based on Laplace Operator to Suppress Artifacts [J]. Computer Science, 2020, 47(11A): 209-214.
[11] HUA Ming, LI Dong-dong, WANG Zhe, GAO Da-qi. End-to-End Speaker Recognition Based on Frame-level Features [J]. Computer Science, 2020, 47(10): 169-173.
[12] SHI Xiao-hong, HUANG Qin-kai, MIAO Jia-xin, SU Zhuo. Edge-preserving Filtering Method Based on Convolutional Neural Networks [J]. Computer Science, 2019, 46(9): 277-283.
[13] JING Jie, CHEN Tan, DU Wen-li, LIU Zhi-kang, YIN Hao. Spatio-temporal Features Extraction of Traffic Based on Deep Neural Network [J]. Computer Science, 2019, 46(11A): 1-4.
[14] ZHANG Rui, ZHAN Yong-song, YANG Ming-hao. Handwritten Drawing Order Recovery Method Based on Endpoint Sequential Prediction [J]. Computer Science, 2019, 46(11A): 264-267.
[15] HAN Jia-lin, WANG Qi-qi, YANG Guo-wei, CHEN Jun, WANG Yi-zhong. SSD Network Compression Fusing Weight and Filter Pruning [J]. Computer Science, 2019, 46(11): 272-276.
Full text



[1] LEI Li-hui and WANG Jing. Parallelization of LTL Model Checking Based on Possibility Measure[J]. Computer Science, 2018, 45(4): 71 -75 .
[2] SUN Qi, JIN Yan, HE Kun and XU Ling-xuan. Hybrid Evolutionary Algorithm for Solving Mixed Capacitated General Routing Problem[J]. Computer Science, 2018, 45(4): 76 -82 .
[3] ZHANG Jia-nan and XIAO Ming-yu. Approximation Algorithm for Weighted Mixed Domination Problem[J]. Computer Science, 2018, 45(4): 83 -88 .
[4] WU Jian-hui, HUANG Zhong-xiang, LI Wu, WU Jian-hui, PENG Xin and ZHANG Sheng. Robustness Optimization of Sequence Decision in Urban Road Construction[J]. Computer Science, 2018, 45(4): 89 -93 .
[5] SHI Wen-jun, WU Ji-gang and LUO Yu-chun. Fast and Efficient Scheduling Algorithms for Mobile Cloud Offloading[J]. Computer Science, 2018, 45(4): 94 -99 .
[6] ZHOU Yan-ping and YE Qiao-lin. L1-norm Distance Based Least Squares Twin Support Vector Machine[J]. Computer Science, 2018, 45(4): 100 -105 .
[7] LIU Bo-yi, TANG Xiang-yan and CHENG Jie-ren. Recognition Method for Corn Borer Based on Templates Matching in Muliple Growth Periods[J]. Computer Science, 2018, 45(4): 106 -111 .
[8] GENG Hai-jun, SHI Xin-gang, WANG Zhi-liang, YIN Xia and YIN Shao-ping. Energy-efficient Intra-domain Routing Algorithm Based on Directed Acyclic Graph[J]. Computer Science, 2018, 45(4): 112 -116 .
[9] CUI Qiong, LI Jian-hua, WANG Hong and NAN Ming-li. Resilience Analysis Model of Networked Command Information System Based on Node Repairability[J]. Computer Science, 2018, 45(4): 117 -121 .
[10] WANG Zhen-chao, HOU Huan-huan and LIAN Rui. Path Optimization Scheme for Restraining Degree of Disorder in CMT[J]. Computer Science, 2018, 45(4): 122 -125 .