Computer Science, 2019, Vol. 46, Issue (7): 233-237. doi: 10.11896/j.issn.1002-137X.2019.07.035

• Graphics, Image & Pattern Recognition •

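Lightweight SSD Network for Real-time Object Detection in Automotive Videos

ZHANG Lin-na1, CHEN Jian-qiang1, CHEN Xiao-ling1, CEN Yi-gang2, KAN Shi-chao2

  1. School of Mechanical Engineering, Guizhou University, Guiyang 550025, China
  2. School of Computer Science & Information Technology, Beijing Jiaotong University, Beijing 100044, China
  • Received: 2018-06-18 Online: 2019-07-15 Published: 2019-07-15
  • About the authors: ZHANG Lin-na, female, master's degree, lecturer; her main research interests include computer vision and mechanical fault diagnosis. CHEN Jian-qiang, male, master's degree, associate professor; his main research interests include mechanical design and fault diagnosis. CHEN Xiao-ling, female, master's degree, lecturer; her main research interests include mechanical design and fault diagnosis. CEN Yi-gang, male, Ph.D., professor; his main research interests include computer vision, image processing, and signal processing; E-mail: ygcen@bjtu.edu.cn (corresponding author). KAN Shi-chao, male, Ph.D. candidate; his main research interests include deep learning and computer vision.
  • Supported by:
    Natural Science Foundation of Guizhou Province (黔科合基础1064), National Natural Science Foundation of China (61872034), Guangzhou Science and Technology Program (201804010271), and Natural Science Foundation of Guangdong Province (2016A030313708)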

Abstract: Vehicle and pedestrian detection is the most basic and most widely studied task in advanced driver-assistance systems (ADAS), and deep learning algorithms currently achieve the best object detection performance. However, deep learning algorithms are computationally expensive and usually need a high-performance GPU to run quickly. In practice, an object detection algorithm generally has to be integrated into the vehicle's hardware system, so its demands on hardware resources cannot be too high. Based on the SSD network, a lightweight SSD network is proposed for real-time object detection. Reducing the input image size and the number of nodes in the fully connected layers lowers the network complexity and increases detection speed. Because the reduced computation lowers the accuracy of vehicle and pedestrian detection, a training method supervised by a multi-stage loss function is proposed. It addresses the loss of image information caused by shrinking the input image and the failure of backpropagation to update the parameters of the shallow VGG convolutional layers effectively. In addition, a hierarchical (multi-scale) image partition method is proposed to expand the training dataset. It addresses the deformation introduced by image scaling and the possible disappearance of objects once the image is shrunk. Experimental results show that the proposed lightweight SSD network not only detects vehicles and pedestrians in real time on a laptop but also maintains detection accuracy. Compared with other object detection algorithms, the optimized network detects vehicles and pedestrians in automotive videos faster and consumes less power at the same detection accuracy.
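The abstract names a hierarchical image partition method for expanding the training set but gives no procedure. The following is only a minimal sketch of one way such an expansion could work, not the authors' implementation. It assumes non-overlapping g x g grids and a visibility threshold for deciding whether a clipped ground-truth box is kept. The function names, the grid sizes, and the `min_visible` parameter are all hypothetical.

```python
from typing import List, Optional, Tuple

# (x_min, y_min, x_max, y_max) in pixels
Box = Tuple[float, float, float, float]


def clip_box_to_tile(box: Box, tile: Box, min_visible: float = 0.5) -> Optional[Box]:
    """Return the box in tile-local coordinates, or None if too little of it is visible.

    min_visible is an assumed threshold, not a value taken from the paper.
    """
    bx0, by0, bx1, by1 = box
    tx0, ty0, tx1, ty1 = tile
    ix0, iy0 = max(bx0, tx0), max(by0, ty0)
    ix1, iy1 = min(bx1, tx1), min(by1, ty1)
    if ix1 <= ix0 or iy1 <= iy0:
        return None  # the box does not intersect this tile
    box_area = (bx1 - bx0) * (by1 - by0)
    visible = (ix1 - ix0) * (iy1 - iy0) / box_area
    if visible < min_visible:
        return None  # mostly cut off, so drop the annotation for this tile
    return (ix0 - tx0, iy0 - ty0, ix1 - tx0, iy1 - ty0)


def multiscale_tiles(width: int, height: int, grids=(1, 2, 3)) -> List[Box]:
    """Split the image into g x g non-overlapping tiles for every grid size g."""
    tiles = []
    for g in grids:
        tw, th = width / g, height / g
        for row in range(g):
            for col in range(g):
                tiles.append((col * tw, row * th, (col + 1) * tw, (row + 1) * th))
    return tiles


def expand_annotations(width, height, boxes, grids=(1, 2, 3), min_visible=0.5):
    """Pair every tile with the ground-truth boxes that remain visible inside it."""
    samples = []
    for tile in multiscale_tiles(width, height, grids):
        clipped = (clip_box_to_tile(box, tile, min_visible) for box in boxes)
        kept = [b for b in clipped if b is not None]
        if kept:  # tiles without any object are skipped in this sketch
            samples.append((tile, kept))
    return samples


if __name__ == "__main__":
    # A 1200 x 400 frame with two objects; sizes are arbitrary illustration values.
    boxes = [(100.0, 50.0, 300.0, 150.0), (900.0, 260.0, 960.0, 320.0)]
    for tile, kept in expand_annotations(1200, 400, boxes, grids=(1, 2)):
        print(tile, kept)
```

Each tile would then be cropped from the frame and resized to the network input size. Under these assumptions, an object that occupies only a few pixels after the whole frame is shrunk covers a larger share of the tile it falls in, which is the effect the abstract attributes to the partition method.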

Key words: Advanced driver-assistance systems, Convolutional neural network, Deep learning, Object detection, SSD

CLC Number:

  • TP391.44