Computer Science, 2019, Vol. 46, Issue (11): 272-276. doi: 10.11896/jsjkx.180901630
韩佳林1, 王琦琦1, 杨国威1, 陈隽2, 王以忠1
HAN Jia-lin1, WANG Qi-qi1, YANG Guo-wei1, CHEN Jun2, WANG Yi-zhong1
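Abstract: Object detection is an important research direction in computer vision. In recent years, deep learning has achieved breakthroughs in video-based object detection. Its strong capacity for feature learning and feature representation allows it to learn, extract, and exploit relevant features automatically. However, complex network structures leave deep learning models with large parameter counts, high computational demands, and large storage footprints. The Single-shot Multi-box Detector 300 (SSD300), a detector based on deep neural networks, can detect objects in video in real time, but it cannot be ported to embedded devices or mobile terminals to meet the requirements of practical applications. To address this problem, this paper proposes a method that combines weight pruning with convolution kernel pruning. First, to address the oversized model caused by the excessive number of weight parameters in deep convolutional neural networks, weight pruning removes the redundant weights in each convolutional layer and determines the weight sparsity of each layer. Then, to address the heavy computation of the convolutional layers, redundant convolution kernels are pruned according to the weight sparsity of each convolutional layer, reducing redundant parameters and computation. Finally, the pruned network is retrained to recover its detection accuracy. To verify the effectiveness of the method, the SSD network model was evaluated on the Caffe convolutional neural network framework. The results show that the compressed and accelerated SSD300 model is 12.5 MB in size and reaches a detection speed of up to 50 FPS (frames per second). While keeping the drop in detection accuracy as small as possible, the experiments compressed the SSD300 network by 8.4× and accelerated it by 2×. The combined weight pruning and convolution kernel pruning method offers a feasible solution for intelligent applications of the SSD300 network in video detection.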
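The first two steps described in the abstract (weight pruning to obtain sparsity, then sparsity-guided kernel pruning) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the magnitude threshold, the per-kernel sparsity cutoff, and the function names `prune_weights`, `kernel_sparsity`, and `prune_kernels` are all assumptions made for illustration, and the final retraining step is omitted.

```python
# Illustrative sketch only. The pruning criteria below (a fixed magnitude
# threshold and a per-kernel sparsity cutoff) are assumptions; the abstract
# does not specify the exact rules used in the paper.

def prune_weights(kernels, threshold):
    """Weight pruning: zero every weight whose magnitude is below `threshold`."""
    return [[w if abs(w) >= threshold else 0.0 for w in k] for k in kernels]

def kernel_sparsity(kernel):
    """Fraction of zero-valued weights in one flattened kernel."""
    return sum(1 for w in kernel if w == 0.0) / len(kernel)

def prune_kernels(kernels, max_sparsity):
    """Kernel pruning: keep only kernels whose sparsity is at most `max_sparsity`."""
    return [k for k in kernels if kernel_sparsity(k) <= max_sparsity]

# A toy convolutional layer: 4 kernels, each flattened to 4 weights.
layer = [
    [0.90, -0.80, 0.70, 0.60],
    [0.01, 0.02, -0.03, 0.50],
    [0.40, -0.02, 0.30, 0.01],
    [0.00, 0.01, -0.01, 0.02],
]

sparse_layer = prune_weights(layer, threshold=0.05)      # step 1: remove small weights
sparsities = [kernel_sparsity(k) for k in sparse_layer]  # per-kernel sparsity
compact_layer = prune_kernels(sparse_layer, max_sparsity=0.5)  # step 2: drop sparse kernels

print(sparsities)          # sparsity of each kernel after weight pruning
print(len(compact_layer))  # number of kernels that survive
```

In this toy layer the second and fourth kernels become mostly zeros after weight pruning and are removed, so two of the four kernels remain. Removing whole kernels shrinks the layer's output channels, which is what reduces computation as well as storage; zeroed individual weights alone only reduce storage unless sparse kernels are used at inference time.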