Novel Deep Learning Algorithm for Monocular Vision: H_SFPN

doi:10.11896/jsjkx.200400090

Computer Science ›› 2021, Vol. 48 ›› Issue (4): 130-137. doi: 10.11896/jsjkx.200400090

• Computer Graphics & Multimedia •

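SHI Xian-rang¹, SONG Ting-lun¹,², TANG De-zhi², DAI Zhen-yong¹

1 College of Energy and Power Engineering, Nanjing University of Aeronautics and Astronautics, Nanjing 210001, China
2 Chery Advanced Engineering & Technology Center, Wuhu, Anhui 241006, China

Received: 2020-06-24 Revised: 2020-07-29 Online: 2021-04-15 Published: 2021-04-09
Corresponding author: SONG Ting-lun (songtinglun@nuaa.edu.cn)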

About author: SHI Xian-rang, born in 1996, postgraduate. His main research interests include autonomous driving, object detection and pattern recognition. (nuaasxr@163.com)
SONG Ting-lun, born in 1965, Ph.D, professor, Ph.D supervisor. His main research interests include simulation-driven vehicle architecture design and development, autonomous driving vehicles, and data-driven energy management strategies for new energy vehicles.
Supported by:
Anhui Provincial Development and Reform Commission’s Major R&D Project.

Abstract

Abstract: This paper proposes H_SFPN, a single-stage deep learning algorithm for monocular visual object detection. Compared with the existing YOLOv3 and CenterNet algorithms, it improves the accuracy of small-object detection while maintaining real-time performance. First, a new network architecture (backbone) is designed that uses an improved Hourglass network model to extract feature maps, so as to make full use of both the high resolution of low-level features and the rich semantic information of high-level features. Second, in the feature map fusion stage, a weighted feature map fusion method based on SFPN is proposed. Finally, the loss functions for object position and size are improved, which reduces the training error and accelerates convergence. Experimental results on the MS COCO dataset show that H_SFPN clearly outperforms existing mainstream deep learning object detection algorithms such as Faster R-CNN, YOLOv3 and EfficientDet; in particular, it achieves the highest small-object detection score, AP_s, at 32.7.

Key words: Backbone, Deep convolutional neural network, Loss function, Object detection, Weighted fusion
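
The idea of weighted feature map fusion mentioned in the abstract can be illustrated with a small sketch. The abstract does not give the SFPN fusion formula, so the code below is an assumption: it uses the normalized weighted sum popularized by EfficientDet's BiFPN (non-negative weights divided by their sum plus a small epsilon) as a stand-in. The function names `upsample_nearest` and `weighted_fusion`, the single-channel list-of-lists representation, and the example weights are all hypothetical and chosen only for illustration.

```python
from typing import List

FeatureMap = List[List[float]]  # a single-channel 2D feature map (illustrative)


def upsample_nearest(fmap: FeatureMap, factor: int) -> FeatureMap:
    """Nearest-neighbour upsampling by an integer factor."""
    out: FeatureMap = []
    for row in fmap:
        wide = [value for value in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


def weighted_fusion(maps: List[FeatureMap], weights: List[float],
                    eps: float = 1e-4) -> FeatureMap:
    """Fuse same-sized maps with non-negative, normalized weights.

    Assumed stand-in for the paper's fusion rule, not the authors' formula.
    """
    if len(maps) != len(weights):
        raise ValueError("one weight per feature map is required")
    height, width = len(maps[0]), len(maps[0][0])
    for fmap in maps:
        if len(fmap) != height or any(len(row) != width for row in fmap):
            raise ValueError("all feature maps must share the same size")
    clipped = [max(weight, 0.0) for weight in weights]  # ReLU on the weights
    total = sum(clipped) + eps                          # eps avoids division by zero
    return [
        [sum(c * fmap[i][j] for c, fmap in zip(clipped, maps)) / total
         for j in range(width)]
        for i in range(height)
    ]


# A high-resolution low-level map and a coarse high-level map (made-up values).
low_level = [[1.0, 2.0], [3.0, 4.0]]
high_level = [[10.0]]
fused = weighted_fusion([low_level, upsample_nearest(high_level, 2)], [1.0, 3.0])
```

In a trained network the weights would be learned parameters rather than constants. The normalization keeps the fused map on the same scale as its inputs regardless of how many levels are combined.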

CLC number: TP391.41

SHI Xian-rang, SONG Ting-lun, TANG De-zhi, DAI Zhen-yong. Novel Deep Learning Algorithm for Monocular Vision: H_SFPN[J]. Computer Science, 2021, 48(4): 130-137. https://doi.org/10.11896/jsjkx.200400090
