Computer Science (计算机科学) ›› 2019, Vol. 46 ›› Issue (11): 272-276. doi: 10.11896/jsjkx.180901630
HAN Jia-lin (韩佳林)¹, WANG Qi-qi (王琦琦)¹, YANG Guo-wei (杨国威)¹, CHEN Jun (陈隽)², WANG Yi-zhong (王以忠)¹
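Abstract: Object detection is an important research direction in computer vision. In recent years, deep learning has achieved breakthroughs in video-based object detection. Its strong capacity for feature learning and feature representation lets it learn and extract relevant features automatically and put them to use. However, complex network structures give deep learning models a large number of parameters, high computational demands, and a large storage footprint. The Single-shot Multi-box Detector 300 (SSD300), which is based on a deep neural network, can detect objects in video in real time, but it cannot be ported to embedded devices or mobile terminals to meet the needs of practical applications. To solve this problem, this paper proposes a method that combines weight pruning with convolution-kernel pruning. First, to address the oversized model caused by the excessive number of weight parameters in a deep convolutional neural network, weight pruning removes the redundant weights in each convolutional layer and determines the weight sparsity of each layer. Then, to address the heavy computation of the convolutional layers, redundant convolution kernels are pruned according to the weight sparsity of each convolutional layer, reducing both redundant parameters and computation. Finally, the pruned network is retrained to recover its detection accuracy. To verify the method's effectiveness, the SSD network model was evaluated on the Caffe convolutional neural network framework. The results show that the compressed and accelerated SSD300 model is 12.5 MB in size and reaches a detection speed of up to 50 FPS (frames per second). While keeping the drop in detection accuracy as small as possible, the experiments compress the SSD300 network by 8.4× and accelerate it by 2×. The combination of weight pruning and kernel pruning offers a feasible approach for intelligent applications of SSD300 in video detection.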
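The first two steps of the pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration in plain Python, not the authors' Caffe implementation: the abstract does not give the pruning criteria, so the magnitude threshold, the L1-norm ranking of kernels, and the rule "remove a share of kernels equal to the layer's sparsity" are all assumptions, and the names `prune_weights`, `layer_sparsity`, and `prune_kernels` are hypothetical. The retraining step is not shown.

```python
from typing import List

Kernel = List[float]   # one convolution kernel, flattened to a list of weights
Layer = List[Kernel]   # one convolutional layer: a list of kernels


def prune_weights(layer: Layer, threshold: float) -> Layer:
    """Step 1 (assumed criterion): zero every weight with magnitude below `threshold`."""
    return [[w if abs(w) >= threshold else 0.0 for w in kernel] for kernel in layer]


def layer_sparsity(layer: Layer) -> float:
    """Fraction of weights in the layer that are exactly zero."""
    flat = [w for kernel in layer for w in kernel]
    return sum(1 for w in flat if w == 0.0) / len(flat)


def prune_kernels(layer: Layer) -> Layer:
    """Step 2 (assumed rule): remove the kernels with the smallest L1 norm.

    The number removed is the layer's sparsity times its kernel count,
    rounded down, and at least one kernel is always kept.
    """
    n = len(layer)
    n_remove = min(int(layer_sparsity(layer) * n), n - 1)
    # Rank kernel indices from weakest to strongest by L1 norm.
    ranked = sorted(range(n), key=lambda i: sum(abs(w) for w in layer[i]))
    removed = set(ranked[:n_remove])
    # Preserve the original order of the surviving kernels.
    return [kernel for i, kernel in enumerate(layer) if i not in removed]


# Toy layer with four kernels of four weights each (made-up values).
toy_layer: Layer = [
    [0.9, -0.8, 0.7, 0.6],
    [0.01, -0.02, 0.03, 0.5],
    [0.02, 0.01, -0.01, 0.03],
    [0.4, -0.05, 0.3, 0.02],
]

sparse_layer = prune_weights(toy_layer, threshold=0.1)
compact_layer = prune_kernels(sparse_layer)
print(layer_sparsity(sparse_layer), len(compact_layer))
```

Weight pruning alone only produces zeros inside tensors that keep their shape, so it shrinks the stored model but not the dense convolution cost; removing whole kernels shrinks the layer itself, which is why the abstract ties the speed-up to the second step.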