Computer Science ›› 2022, Vol. 49 ›› Issue (11A): 210600142-7. doi: 10.11896/jsjkx.210600142

• Image Processing & Multimedia Technology •

Multi-label Vehicle Real-time Recognition Algorithm Based on YOLOv3 and Improved VGGNet

GU Xi-long, GONG Ning-sheng, HU Qian-sheng   

  1. College of Computer Science and Technology, Nanjing Tech University, Nanjing 211816, China
  • Online: 2022-11-10 Published: 2022-11-21
  • About author: GU Xi-long, born in 1993, postgraduate. His main research interests include deep learning and target detection.
    GONG Ning-sheng, born in 1958, Ph.D., professor. His main research interests include mathematical logic, BP neural networks, image processing, pattern recognition, data mining and so on.
  • Supported by:
    National Key Basic Research and Development Program (973 Program) (2005CB321901), Mobile Environment Law Enforcement Video Acquisition and Management System Based on High Compression Ratio Technology (ZX16487470001), and Open Project of the State Key Laboratory of Software Development Environment (BUAA-SKLSDE-09KF-03).

Abstract: To identify vehicle information in video quickly and effectively, this paper combines YOLOv3 with a CNN classifier to design an algorithm that recognizes multi-label vehicle information in real time. First, the high speed and accuracy of YOLOv3 are used to detect and localize vehicles in the video stream in real time. Once a vehicle's location is obtained, the vehicle region is passed to a simplified and optimized VGGNet multi-label classification network, which assigns multiple labels to the vehicle. Finally, the label information is written back to the video stream, yielding real-time multi-label vehicle recognition in video. The training and test sets are drawn from the KITTI dataset and from a multi-label dataset collected through the Bing Image Search API. Experimental results show that the proposed method reaches an mAP of 91.27 on the KITTI dataset, an average multi-label accuracy above 80%, and a video frame rate of 35 fps. It thus achieves good results in vehicle recognition and multi-label classification while maintaining real-time performance.
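
The two-stage flow described in the abstract (detect vehicles, crop each one, classify the crop with several independent labels) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the `detect` and `classify` callables stand in for YOLOv3 and the modified VGGNet, the label names are hypothetical, and the per-label sigmoid with a 0.5 threshold is a common multi-label convention that the abstract does not confirm.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple

Box = Tuple[int, int, int, int]  # (x1, y1, x2, y2) in pixels
Frame = List[List[int]]          # toy grayscale frame: rows of pixel values

# Hypothetical label set; the abstract does not list the actual labels.
LABELS = ["red", "white", "sedan", "suv"]


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def crop(frame: Frame, box: Box) -> Frame:
    """Cut the detected region out of the frame, clamped to the frame bounds."""
    height, width = len(frame), len(frame[0])
    x1, y1, x2, y2 = box
    x1, x2 = max(0, x1), min(width, x2)
    y1, y2 = max(0, y1), min(height, y2)
    return [row[x1:x2] for row in frame[y1:y2]]


def decode_labels(logits: Sequence[float], threshold: float = 0.5) -> Dict[str, float]:
    """Score each label independently (assumed sigmoid head), so one vehicle
    can carry several labels at once; keep those at or above the threshold."""
    scores = {name: sigmoid(z) for name, z in zip(LABELS, logits)}
    return {name: s for name, s in scores.items() if s >= threshold}


def recognize(
    frame: Frame,
    detect: Callable[[Frame], List[Box]],          # stand-in for the detector
    classify: Callable[[Frame], Sequence[float]],  # stand-in for the classifier
    threshold: float = 0.5,
) -> List[Tuple[Box, Dict[str, float]]]:
    """Run detection, then multi-label classification on each vehicle crop."""
    results = []
    for box in detect(frame):
        patch = crop(frame, box)
        if not patch or not patch[0]:
            continue  # box fell entirely outside the frame
        results.append((box, decode_labels(classify(patch), threshold)))
    return results


# Toy stand-ins so the sketch runs without any model weights.
def fake_detector(frame: Frame) -> List[Box]:
    return [(1, 1, 3, 3)]


def fake_classifier(patch: Frame) -> Sequence[float]:
    return [2.0, -2.0, 1.0, -1.0]


frame = [[0] * 4 for _ in range(4)]
for box, labels in recognize(frame, fake_detector, fake_classifier):
    print(box, list(labels))  # (1, 1, 3, 3) ['red', 'sedan']
```

In a real system the returned boxes and labels would be drawn onto each frame before it is written back to the video stream; that step is omitted here because it depends on the video library in use.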

Key words: Computer vision, Vehicle recognition, Multi-label recognition, Target detection, Deep learning

CLC Number: 

  • TP183