Computer Science ›› 2020, Vol. 47 ›› Issue (8): 261-266. doi: 10.11896/jsjkx.190700062


Convolutional Neural Networks Compression Based on Pruning and Quantization

SUN Yan-li, YE Jiong-yao   

  1. College of Information Science and Engineering, East China University of Science and Technology, Shanghai 200237, China
  • Online: 2020-08-15  Published: 2020-08-10
  • About author: SUN Yan-li, born in 1992, postgraduate. Her main research interests include neural network compression.
    YE Jiong-yao, born in 1978, professor, postgraduate supervisor. His main research interests include IC/SoC design, low-power design, and chip development for video, radio, and television.

Abstract: With the development of deep learning, Convolutional Neural Networks (CNNs), as one of its key algorithms, are widely applied in fields such as object detection, natural language processing, speech recognition, and image classification, and achieve better results than traditional algorithms. However, the number of parameters and the amount of computation grow with network depth, so many such models must run on GPUs, which makes it difficult to deploy CNN models on mobile terminals with limited resources and strict real-time requirements. To solve this problem, this paper presents a method that optimizes the network structure and its parameters simultaneously. First, the algorithm prunes weights according to their influence on the network's output, removing redundant connections while retaining the model's important ones. Then, the floating-point weights and activations of the CNN are quantized, converting floating-point operations into fixed-point operations; this reduces both the computational complexity and the size of the network model. The method is verified with the TensorFlow deep learning framework, using the Spyder IDE on Ubuntu 16.04. Experimental results show that the method reduces the size of the structurally simple LeNet model from 1.64 MB to 0.36 MB, a compression ratio of 78%, with an accuracy drop of only 0.016; it also reduces the size of the lightweight MobileNet from 16.9 MB to 3.1 MB, a compression ratio of 81%, with an accuracy drop of only 0.03. These results show that combining weight pruning with parameter quantization can effectively compress convolutional neural networks within an acceptable range of accuracy loss, easing the difficulty of deploying them on mobile terminals.
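The paper itself includes no source code. As a rough, self-contained illustration of the two steps the abstract describes, the NumPy sketch below first applies magnitude-based weight pruning and then simulates 8-bit uniform affine quantization (the "fake quantization" round trip used in quantization-aware training). The function names, the magnitude threshold criterion, and the 70% sparsity in the demo are illustrative assumptions; the authors' actual method prunes by each weight's influence on the network output and performs quantization inside TensorFlow.

```python
import numpy as np

def magnitude_prune(weights, sparsity):
    """Zero out the smallest-magnitude weights.

    Assumption: magnitude is used here as a proxy for a weight's
    influence on the network output; the paper's exact importance
    criterion may differ.
    """
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)
    if k == 0:
        return weights.copy(), np.ones(weights.shape, dtype=bool)
    threshold = np.partition(flat, k - 1)[k - 1]  # k-th smallest magnitude
    mask = np.abs(weights) > threshold            # keep only larger weights
    return weights * mask, mask

def quantize_dequantize(x, num_bits=8):
    """Uniform affine quantization followed by dequantization.

    This round trip mimics the fixed-point arithmetic the abstract
    describes: values are mapped to num_bits-bit integers and back,
    so the returned tensor carries the quantization error.
    """
    qmin, qmax = 0, 2 ** num_bits - 1
    x_min = min(float(x.min()), 0.0)  # range must contain zero so that
    x_max = max(float(x.max()), 0.0)  # zero is exactly representable
    scale = (x_max - x_min) / (qmax - qmin)
    if scale == 0.0:                  # constant tensor; any scale works
        scale = 1.0
    zero_point = int(round(qmin - x_min / scale))
    q = np.clip(np.round(x / scale) + zero_point, qmin, qmax)
    return (q - zero_point) * scale   # back to floating point

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = rng.normal(size=(64, 64)).astype(np.float32)
    pruned, mask = magnitude_prune(w, sparsity=0.7)  # 70% is illustrative
    w_q = quantize_dequantize(pruned, num_bits=8)
    print(f"sparsity: {(pruned == 0).mean():.2f}")
    print(f"max quantization error: {np.abs(w_q - pruned).max():.4f}")
```

In a real training pipeline the pruning mask would be reapplied after every update step, and the quantize-dequantize round trip would be inserted into the forward pass (e.g. via TensorFlow's fake-quantization ops), so the network learns to compensate for both kinds of error.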

Key words: Convolutional neural networks, Network compression, Parameter quantization, Quantization-aware training, Weight pruning

CLC Number: TP183