计算机科学 (Computer Science) ›› 2021, Vol. 48 ›› Issue (12): 264-268. doi: 10.11896/jsjkx.201200196

• Computer Graphics & Multimedia •

基于相邻特征融合的目标检测

李亚泽, 刘宏哲   

  1. 北京联合大学北京市信息服务工程重点实验室 北京100101
    北京联合大学机器人学院 北京100101
  • 收稿日期:2020-12-22 修回日期:2021-06-08 出版日期:2021-12-15 发布日期:2021-11-26
  • Corresponding author: LIU Hong-zhe (liuhongzhe@buu.edu.cn)
  • About author: yaze_li@126.com
  • 基金资助:
    国家自然科学基金(61871039,61906017,61802019);北京市教委项目(KM202111417001,KM201911417001);视觉智能协同创新中心项目(CYXC2011);北京联合大学学术项目(ZK80202001,202011417004,202011417005)

Object Detection Based on Neighbour Feature Fusion

LI Ya-ze, LIU Hong-zhe   

  1. Beijing Key Laboratory of Information Service Engineering,Beijing Union University,Beijing 100101,China
    College of Robotics,Beijing Union University,Beijing 100101,China
  • Received:2020-12-22 Revised:2021-06-08 Online:2021-12-15 Published:2021-11-26
  • About author:LI Ya-ze,born in 1991,postgraduate.His main research interests include computer vision and object detection.
    LIU Hong-zhe,born in 1971,Ph.D.Her main research interests include computer vision,deep learning,media semantic computing,etc.
  • Supported by:
    National Natural Science Foundation of China(61871039,61906017,61802019),Beijing Municipal Commission of Education Project(KM202111417001,KM201911417001),Collaborative Innovation Center for Visual Intelligence(CYXC2011) and Academic Research Projects of Beijing Union University(ZK80202001,202011417004,202011417005).

摘要: 随着智能驾驶领域的发展,人们对目标检测的精度要求越来越高,尤其是针对高速行驶时对距离较远的小目标的检测和低速行驶时对密集目标的检测。在当前的两阶段检测框架的特征融合部分,使用bottom-up的双向融合方法虽然能够更有效地对大目标进行语义信息和位置信息的特征融合,但会给几个或几十个像素的小目标造成很大的信息损失。当检测网络特征融合部分使用top-down的单向融合方法时,则对大目标检测的效果欠佳。为此,文中提出了相邻特征融合(Neighbour Feature Pyramid Network,NFPN)方法、Double RoI(Region of Interest)方法和递归特征金字塔(Recursive Feature Pyramid,RFP)的方法。以Faster RCNN 50为基准,同时使用提出的NFPN,Double RoI和RFP后,在Lisa交通数据集中平均精度(mAP)提升了2.6个百分点。在VOC2007数据集上,以VOC07+12 train数据集为训练集,VOC2007 test为测试集,以Faster RCNN101为基准,同时使用提出的3个模型,mAP提升了6个百分点,同时小、中、大目标的精度也得到提高。

关键词: 计算机视觉, 目标检测, 深度学习, 特征融合, 智能驾驶

Abstract: As intelligent driving develops, the accuracy required of object detection keeps rising, especially for distant small objects at high driving speeds and for dense objects at low speeds. In the feature-fusion part (the neck) of current two-stage detection frameworks, bidirectional fusion that adds a bottom-up path fuses semantic and location information more effectively for large objects, but it causes substantial information loss for small objects that span only a few or a few dozen pixels. Top-down one-way fusion, in turn, performs poorly on large objects. To address this, we propose the neighbour feature pyramid network (NFPN) method, which fuses the features of neighbouring layers; the Double RoI (Region of Interest) method, which fuses the FPN and NFPN features; and a recursive feature pyramid (RFP) method. With Faster RCNN 50 as the baseline, using NFPN, Double RoI and RFP together raises the mean average precision (mAP) on the Lisa traffic data set by 2.6 percentage points. On the VOC2007 data set, with VOC07+12 train as the training set, VOC2007 test as the test set and Faster RCNN101 as the baseline, using all three raises the mAP by 6 percentage points, and the detection accuracy for small, medium and large objects improves as well.
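To make the fusion terminology in the abstract concrete, the following is a minimal sketch in plain Python, not the authors' implementation. `top_down_fuse` follows the standard FPN top-down pathway (upsample the coarser fused map and add it to the finer one), with the 1x1 lateral and 3x3 output convolutions omitted for brevity. `neighbour_only_fuse` is a hypothetical illustration of one possible reading of "fusion of neighbour layers", in which each level is combined only with its adjacent coarser level; the abstract does not specify NFPN at this level of detail, so that function and all names here are assumptions.

```python
from typing import List

Grid = List[List[float]]  # one single-channel feature map


def upsample2x(x: Grid) -> Grid:
    """Nearest-neighbour 2x upsampling: every value becomes a 2x2 block."""
    out: Grid = []
    for row in x:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def add(a: Grid, b: Grid) -> Grid:
    """Element-wise sum of two maps of identical size."""
    return [[p + q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]


def top_down_fuse(levels: List[Grid]) -> List[Grid]:
    """Standard FPN-style top-down pathway (convolutions omitted).

    `levels` is ordered fine to coarse; each map is half the size of the
    previous one. Information from the coarsest map propagates all the
    way down to the finest map.
    """
    fused: List[Grid] = [[] for _ in levels]
    fused[-1] = levels[-1]
    for i in range(len(levels) - 2, -1, -1):
        fused[i] = add(levels[i], upsample2x(fused[i + 1]))
    return fused


def neighbour_only_fuse(levels: List[Grid]) -> List[Grid]:
    """HYPOTHETICAL: fuse each level with its adjacent coarser level only.

    This is an assumption made for illustration, not the paper's NFPN.
    """
    fused = [add(levels[i], upsample2x(levels[i + 1]))
             for i in range(len(levels) - 1)]
    fused.append(levels[-1])
    return fused


# Three toy levels: 4x4, 2x2 and 1x1, filled with 1, 10 and 100.
levels = [
    [[1.0] * 4 for _ in range(4)],
    [[10.0] * 2 for _ in range(2)],
    [[100.0]],
]
fpn = top_down_fuse(levels)
nbr = neighbour_only_fuse(levels)
print(fpn[0][0][0], nbr[0][0][0])  # 111.0 11.0
```

In the toy run, the finest top-down map accumulates contributions from every coarser level (1 + 10 + 100), whereas the neighbour-only variant sees just the adjacent level (1 + 10).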

Key words: Autonomous driving, Computer vision, Deep learning, Feature fusion, Object detection

CLC Number:

  • TP183