Computer Science ›› 2022, Vol. 49 ›› Issue (6A): 337-344. doi: 10.11896/jsjkx.210600204

• Image Processing & Multimedia Technology •

Study on Knowledge Distillation of Target Detection Algorithm Based on YOLOv4

CHU Yu-chun1, GONG Hang1, WANG Xue-fang2, LIU Pei-shun1

  1. School of Computer Science and Technology, Ocean University of China, Qingdao, Shandong 266100, China
  2. School of Mathematical Sciences, Ocean University of China, Qingdao, Shandong 266100, China
  • Online: 2022-06-10 Published: 2022-06-08
  • About author: CHU Yu-chun, born in 1996, postgraduate. His main research interests include information security and object detection.
    WANG Xue-fang, born in 1975, Ph.D., lecturer. Her main research interests include artificial intelligence and deep learning.

Abstract: Knowledge distillation is a training method built on a teacher-student framework: a complex teacher network guides the training of a relatively simple student network, so that the student network can reach the same accuracy as the teacher network. It has been widely studied in natural language processing and image classification, whereas research on object detection is comparatively scarce and the reported results leave room for improvement. Distillation algorithms for object detection operate mainly on the feature extraction layer, and distilling from a single feature extraction layer prevents the student from fully learning the teacher network's knowledge, which lowers the accuracy of the model. To address this problem, this paper uses the “knowledge” in the teacher network's feature extraction, target classification, and bounding-box prediction to guide the training of the student network, and proposes a multi-scale attention distillation algorithm that transfers the teacher network's knowledge to the student network. Experimental results show that the proposed YOLOv4-based distillation algorithm effectively improves the detection accuracy of the original student network.
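The abstract does not state the loss functions, so the following is only a minimal sketch of two standard ingredients that a teacher-student scheme of this kind could combine. It assumes a soft-target classification loss in the style of Hinton et al. and an activation-based attention loss in the style of Zagoruyko and Komodakis, summed over several feature scales. All function names, the temperature value, and the plain nested-list tensor layout (channels x height x width) are illustrative assumptions, not the paper's implementation. Bounding-box distillation is not covered.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of class logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def soft_target_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on softened class distributions, scaled by T^2.

    The temperature of 2.0 is an assumed default, not a value from the paper.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    eps = 1e-12  # guards against log(0) if a probability underflows
    kl = sum(pi * (math.log(pi) - math.log(max(qi, eps)))
             for pi, qi in zip(p, q) if pi > 0.0)
    return kl * temperature ** 2


def attention_map(feature):
    """Collapse a C x H x W feature map into a flat, L2-normalised attention map.

    Each spatial position gets the sum of squared activations over channels.
    """
    channels = len(feature)
    height, width = len(feature[0]), len(feature[0][0])
    flat = [sum(feature[c][i][j] ** 2 for c in range(channels))
            for i in range(height) for j in range(width)]
    norm = math.sqrt(sum(v * v for v in flat)) or 1.0
    return [v / norm for v in flat]


def attention_loss(student_features, teacher_features):
    """Sum over scales of the squared distance between attention maps.

    Student and teacher may differ in channel count, but each pair of
    feature maps must share the same spatial size.
    """
    loss = 0.0
    for f_s, f_t in zip(student_features, teacher_features):
        a_s, a_t = attention_map(f_s), attention_map(f_t)
        loss += sum((x - y) ** 2 for x, y in zip(a_s, a_t))
    return loss


# Usage: one 2-class prediction and one 1 x 2 x 2 feature scale (toy values).
teacher_feat = [[[1.0, 2.0], [3.0, 4.0]]]
student_feat = [[[4.0, 3.0], [2.0, 1.0]]]
total = (soft_target_loss([0.2, 1.5], [0.1, 2.0])
         + attention_loss([student_feat], [teacher_feat]))
print(total)
```

Because the attention map sums over channels before comparison, a thin student and a wide teacher can be compared directly at each scale without an extra adaptation layer. How the individual terms are weighted is a further design choice that this sketch leaves open.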

Key words: Attention mechanism, Deep learning, Knowledge distillation, Model compression, YOLOv4

CLC Number: TP391