基于剪枝与量化的卷积神经网络压缩方法

doi:10.11896/jsjkx.190700062

Abstract

Abstract: With the development of deep learning, Convolutional Neural Networks(CNN), as one of its important algorithms, is widely applied in a variety of fields such as target detection, natural language processing, speech recognition and image identification, and achieves better results than the traditional algorithm.The number of parameters and calculation are increasing with the depth of network structure, resulting in many algorithms must be implemented on GPU.So it is difficult to apply the CNN model to mobile terminals which has limited resources and high real-time request.To solve this problem, this paper presents a method of optimizing network structure and parameters simultaneously.Firstly, the algorithms prunes weight according to its influence on the results of network, and ensure that the redundant information is removed while retaining the important connection of the mo-del.Then, this paper quantizes the float-point weight and activation of CNN.This changes float-point operation to fixed-point ope-ration.It not only reduces the computational complexity of the network model, but also reduces the size of the network model.To verify the algorithm, deep learning framework tensorflow is selected, and Spyder compiler is used in Ubuntu 16.04 operating system.The experimental results show that this method reduces size of LeNet model with simple structure from 1.64M to 0.36M.Compression ratio reaches 78%, but the accuracy drops only 0.016.And it also reduces the size of MobileNet with Lightweight network from 16.9M to 3.1M.Compression ratio reaches 81% with the accuracy dropping only 0.03.The data show that combining the weight pruning and parameter quantification of convolution neural network can effectively compress the convolution neural network within the acceptable range of accuracy loss.So, this method solve the difficulty of deploying the convolution neural network to the mobile terminal.

Key words: Convolutional neural networks, Network compression, Parameters quantization, Quantization-aware-training, Weight pruning

CLC Number:

TP183

SUN Yan-li, YE Jiong-yao. Convolutional Neural Networks Compression Based on Pruning and Quantization[J].Computer Science, 2020, 47(8): 261-266.

References

[1]KRIZHEVSKY A, SUTSKEVER I, HINTON G E.Imagenet classification with deep convolutional neural networks[C]∥Advances in Neural Information Processing Systems.2012:1097-1105.
[2]SIMONYAN K, ZISSERMAN A.Very deep convolutional networks for large-scale image recognition.arXiv:1409.1556, 2014.
[3]SZEGEDY C, LIU W, JIA Y, et al.Going deeper with convolutions[C]∥Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.2015:1-9.
[4]HE K, ZHANG X, REN S, et al.Deep residual learning for ima-ge recognition[C]∥Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.2016:770-778.
[5]REN S, HE K, GIRSHICK R, et al.Faster r-cnn:Towards real-time object detection with region proposal networks[C]∥Advances in Neural Information Processing Systems.2015:91-99.
REDMON J, DIVVALA S, GIRSHICK R, et al.You only look once:Unified, real-time object detection[C]∥Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.2016:779-788.
[7]LONG J, SHELHAMER E, DARRELL T.Fully convolutional networks for semantic segmentation[C]∥Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.2015:3431-3440.
[8]HOWARD A G, ZHU M, CHEN B, et al.Mobilenets:Efficientconvolutional neural networks for mobile vision applications.arXiv:1704.04861, 2017.
[9]IANDOLA F N, HAN S, MOSKEWICZ M W, et al.Squee-zeNet:AlexNet-level accuracy with 50x fewer parameters and< 0.5MB model size.arXiv:1602.07360, 2016.
[10]ZHANG X, ZHOU X, LIN M, et al.Shufflenet:An extremely efficient convolutional neural network for mobile devices[C]∥Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.2018:6848-6856.
[11]HUANG G, LIU Z, VAN DER MAATEN L, et al.Denselyconnected convolutional networks[C]∥Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.2017:4700-4708.
[12]HAN S, POOL J, TRAN J, et al.Learning both weights andconnections for efficient neural network[C]∥Advances in Neural Information Processing Systems.2015:1135-1143.
[13]HAN S, MAO H, DALLY W J.Deep compression:Compressing deep neural networks with pruning, trained quantization and huffman coding.arXiv:1510.00149, 2015.
[14]LECUN Y, DENKER J S, SOLLA S A.Optimal brain damage[C]∥Advances in Neural Information Processing Systems.1990:598-605.
[15]LEBEDEV V, LEMPITSKY V.Fast convNets using group-wise brain damage∥Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.2016:2554-2564.
COURBARIAUX M, BENGIO Y, DAVID J P.Binaryconnect:Training deep neural networks with binary weights during propa-gations[C]∥Advances in Neural Information Processing Systems.2015:3123-3131.
RASTEGARI M, ORDONEZ V, REDMON J, et al.Xnor-net:Imagenet classification using binary convolutional neural networks[C]∥European Conference on Computer Vision.Cham:Springer, 2016:525-542.
LI F, ZHANG B, LIU B.Ternary weight networks.ar-Xiv:1605.04711, 2016.
ZHOU A, YAO A, GUO Y, et al.Incremental network quantization:Towards lossless cnns with low-precision weights .ar-Xiv:1702.03044, 2017.
COURBARIAUX M, HUBARA I, SOUDRY D, et al.Binarized Neural Networks:Training Deep Neural Networks with Weights and Activations Constrained to +1 or －1.arXiv:1602.02830, 2016.
WANG P, HU Q, ZHANG Y, et al.Two-step quantization for low-bit neural networks[C]∥Proceedings of the IEEEConfe-rence on Computer Vision and Pattern Recognition.2018:4376-4384.
KRISHNAMOORTHI R.Quantizing deep convolutional net-works for efficient inference:A whitepaper.arXiv:1806.08342, 2018.
CAI R C, ZHONG C R, YU Y, et al.CNN quantization and com-pression strategy for edge computing applications.Journal of Computer Applications, 2018, 38(9):2449-2454.
JACOB B, KLIGYS S, CHEN B, et al.Quantization and training of neural networks for efficient integer-arithmetic-only inference[C]∥Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.2018:2704-2713.
GYSEL P, MOTAMEDI M, GHIASI S.Hardware-oriented approximation of convolutional neural networks.arXiv:1604.03168, 2016.
HAMMERSTROM D.A VLSI architecture for high-perfor-mance, low-cost, on-chip learning∥1990 IJCNN International Joint Conference on Neural Networks.IEEE, 1990:537-544.
IOFFE S, SZEGEDY C.Batch normalization:Accelerating deep network training by reducing internal covariate shift.arXiv:1502.0316.2015.

Related Articles 15

[1]	ZHU Cheng-zhang, HUANG Jia-er, XIAO Ya-long, WANG Han, ZOU Bei-ji. Deep Hash Retrieval Algorithm for Medical Images Based on Attention Mechanism [J]. Computer Science, 2022, 49(8): 113-119.
[2]	WANG Jian-ming, CHEN Xiang-yu, YANG Zi-zhong, SHI Chen-yang, ZHANG Yu-hang, QIAN Zheng-kun. Influence of Different Data Augmentation Methods on Model Recognition Accuracy [J]. Computer Science, 2022, 49(6A): 418-423.
[3]	SUN Jie-qi, LI Ya-feng, ZHANG Wen-bo, LIU Peng-hui. Dual-field Feature Fusion Deep Convolutional Neural Network Based on Discrete Wavelet Transformation [J]. Computer Science, 2022, 49(6A): 434-440.
[4]	CHEN Zhi-yi, SUI Jie. DeepFM and Convolutional Neural Networks Ensembles for Multimodal Rumor Detection [J]. Computer Science, 2022, 49(1): 101-107.
[5]	CHEN Zhi-wen, WANG Kun, ZHOU Guang-yun, WANG Xu, ZHANG Xiao-dan, ZHU Hu-ming. SAR Image Change Detection Method Based on Capsule Network with Weight Pruning [J]. Computer Science, 2021, 48(7): 190-198.
[6]	HE Qing-fang, WANG Hui, CHENG Guang. Research on Classification of Breast Cancer Pathological Tissues with Adaptive Small Data Set [J]. Computer Science, 2021, 48(6A): 67-73.
[7]	GAO Chuang, LI Jian-hua, JI Xiu-yi, ZHU Cheng-long, LI Shi-liang, LI Hong-lin. Drug Target Interaction Prediction Method Based on Graph Convolutional Neural Network [J]. Computer Science, 2021, 48(10): 127-134.
[8]	MA Hai-Jiang. Recommendation Algorithm Based on Convolutional Neural Network and Constrained Probability Matrix Factorization [J]. Computer Science, 2020, 47(6A): 540-545.
[9]	ZHENG Zhe, HU Qing-hao, LIU Qing-shan, LENG Cong. Quantizing Weights and Activations in Generative Adversarial Networks [J]. Computer Science, 2020, 47(5): 144-148.
[10]	PENG Xian, PENG Yu-xu, TANG Qiang, SONG Yan-qi. Crowd Counting Based on Single-column Multi-scale Convolutional Neural Network [J]. Computer Science, 2020, 47(4): 150-156.
[11]	LIU Yu-hong,LIU Shu-ying,FU Fu-xiang. Optimization of Compressed Sensing Reconstruction Algorithms Based on Convolutional Neural Network [J]. Computer Science, 2020, 47(3): 143-148.
[12]	HUANG Hong-wei,LIU Yu-jiao,SHEN Zhuo-kai,ZHANG Shao-wei,CHEN Zhi-min,GAO Yang. End-to-end Track Association Based on Deep Learning Network Model [J]. Computer Science, 2020, 47(3): 200-205.
[13]	WANG Li-hua,DU Ming-hui,LIANG Ya-ling. Classification Net Based on Angular Feature [J]. Computer Science, 2020, 47(2): 83-87.
[14]	FU Xue-yang,SUN Qi,HUANG Yue,DING Xing-hao. Single Image De-raining Method Based on Deep Adjacently Connected Networks [J]. Computer Science, 2020, 47(2): 106-111.
[15]	SHAO Yang-xue, MENG Wei, KONG Deng-zhen, HAN Lin-xuan, LIU Yang. Cross-modal Retrieval Method for Special Vehicles Based on Deep Learning [J]. Computer Science, 2020, 47(12): 205-209.

Metrics

Viewed

Full text

Abstract

Cited

Shared

Discussed

Comments

Recommended 0

No Suggested Reading articles found!

Convolutional Neural Networks Compression Based on Pruning and Quantization

PDF (PC)

Abstract

Cite this article

share this article

References

Related Articles 15

Metrics

Comments

Recommended 0