Computer Science, 2022, Vol. 49, Issue (11A): 210600142-7. doi: 10.11896/jsjkx.210600142

• Image Processing & Multimedia Technology •

Multi-label Vehicle Real-time Recognition Algorithm Based on YOLOv3 and Improved VGGNet

GU Xi-long, GONG Ning-sheng, HU Qian-sheng

  1. College of Computer Science and Technology, Nanjing Tech University, Nanjing 211816, China
  • Online: 2022-11-10 Published: 2022-11-21
  • Corresponding author: GONG Ning-sheng (chinahqs@163.com)
  • About author: (781537596@qq.com)
    GU Xi-long, born in 1993, postgraduate. His main research interests include deep learning and target detection.
    GONG Ning-sheng, born in 1958, Ph.D, professor. His main research interests include mathematical logic, BP neural networks, image processing, pattern recognition, data mining and so on.
  • Supported by:
    National Key Basic Research and Development Program (973 Program) (2005CB321901), Mobile Environment Law Enforcement Video Acquisition and Management System Based on High Compression Ratio Technology (ZX16487470001) and Open Project of the State Key Laboratory of Software Development Environment (BUAA-SKLSDE-09KF-03).

Abstract: To identify vehicle information in video quickly and effectively, this paper combines the strengths of the YOLOv3 algorithm and convolutional neural networks (CNNs) to design an algorithm that recognizes multi-label vehicle information in real time. First, YOLOv3, which offers high recognition speed and accuracy, detects and localizes vehicles in the video stream in real time. Once the vehicle location is obtained, the vehicle information is passed to a simplified and optimized VGGNet-like multi-label classification network, which assigns multiple labels to the vehicle. Finally, the label information is written back to the video stream, yielding real-time multi-label recognition of the vehicles in the video. The training and test data come from the KITTI dataset and a multi-label dataset collected through the Bing Image Search API. Experimental results show that the proposed method reaches an mAP of 91.27 on the KITTI dataset, an average multi-label accuracy above 80%, and a video frame rate of 35 fps, achieving good vehicle recognition and multi-label classification while maintaining real-time performance.

Key words: Computer vision, Vehicle recognition, Multi-label recognition, Target detection, Deep learning
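
The two-stage flow described in the abstract (detect vehicles, classify each detection with a multi-label network, attach the labels to the frame) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the detector and classifier are stand-in stubs rather than YOLOv3 or the VGGNet-like network, the label names are hypothetical, cropping the detected box before classification is an assumption about what "vehicle information" means, and per-label sigmoid outputs with a 0.5 threshold are an assumed (though common) way to decode multi-label predictions.

```python
from dataclasses import dataclass
from math import exp
from typing import Callable, List, Sequence, Tuple

Box = Tuple[int, int, int, int]   # (x1, y1, x2, y2) in pixels
Frame = List[List[int]]           # toy grayscale frame: rows of pixel values


@dataclass
class Detection:
    box: Box
    score: float                  # detector confidence in [0, 1]


def crop(frame: Frame, box: Box) -> Frame:
    """Cut the detected vehicle region out of the frame."""
    x1, y1, x2, y2 = box
    return [row[x1:x2] for row in frame[y1:y2]]


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + exp(-z))


def decode_labels(logits: Sequence[float], names: Sequence[str],
                  threshold: float = 0.5) -> List[str]:
    """Assumed multi-label decoding: every label is an independent yes/no
    decision, so several labels can be active for the same vehicle."""
    return [name for name, z in zip(names, logits) if sigmoid(z) >= threshold]


def recognize(frame: Frame,
              detector: Callable[[Frame], List[Detection]],
              classifier: Callable[[Frame], Sequence[float]],
              names: Sequence[str],
              min_score: float = 0.5) -> List[Tuple[Box, List[str]]]:
    """Stage 1: detect vehicles. Stage 2: label each confident detection."""
    results = []
    for det in detector(frame):
        if det.score < min_score:
            continue              # drop low-confidence boxes
        patch = crop(frame, det.box)
        results.append((det.box, decode_labels(classifier(patch), names)))
    return results


# Hypothetical stand-ins so the sketch runs without any trained model.
LABEL_NAMES = ["red", "suv", "sedan"]


def stub_detector(frame: Frame) -> List[Detection]:
    return [Detection((1, 1, 3, 3), 0.9), Detection((0, 0, 2, 2), 0.3)]


def stub_classifier(patch: Frame) -> List[float]:
    return [2.0, -1.0, 0.5]       # one logit per entry in LABEL_NAMES


if __name__ == "__main__":
    frame = [[0] * 4 for _ in range(4)]
    print(recognize(frame, stub_detector, stub_classifier, LABEL_NAMES))
    # prints [((1, 1, 3, 3), ['red', 'sedan'])]
```

In a real system the returned boxes and label lists would be drawn onto each video frame before it is displayed or written out, which corresponds to the final step in the abstract.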

CLC Number: TP183