Computer Science, 2023, Vol. 50, Issue (3): 231-237.

• Computer Graphics & Multimedia •


  • Corresponding author: CHEN Xiuhong (625325682@163.com)
  • Author e-mail: 17760867927@163.com

SSD Object Detection Algorithm with Cross-layer Fusion and Receptive Field Amplification

ZHANG Weiliang, CHEN Xiuhong   

  1. School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi, Jiangsu 214122, China
    Jiangsu Key Laboratory of Media Design and Software Technology, Wuxi, Jiangsu 214122, China
  Received: 2021-11-28 Revised: 2022-08-20 Online: 2023-03-15 Published: 2023-03-15
  • About author: ZHANG Weiliang, born in 1997, postgraduate. His main research interests include object detection.
    CHEN Xiuhong, born in 1964, professor. His main research interests include digital image processing, pattern recognition, artificial intelligence and moving target tracking.

Abstract: To address the lack of information interaction between different layers of the single shot multibox detector (SSD) and the limited receptive field of the model, an improved SSD object detection algorithm, named ESSD (enhanced SSD), is proposed to improve detection accuracy. First, building on the original multi-scale feature maps of the SSD model and the idea of feature pyramid networks (FPN), a cross-layer information interaction module is designed, which strengthens the semantic information of each layer and reduces the information gap between layers. Then, to enlarge the receptive field and improve the multi-scale detection capability of the model, a receptive field amplification module is designed. Finally, batch normalization layers are used to speed up convergence and shorten training time. To evaluate the effectiveness of ESSD, experiments are conducted on the PASCAL VOC2007 and PASCAL VOC2012 test sets. On PASCAL VOC2007, ESSD reaches an mAP of 82.1% at a detection speed of 15.7 FPS, an improvement of 2.3 percentage points over the original SSD512; on the PASCAL VOC2012 test set, its mAP reaches 80.6%, which is 2.1 percentage points higher than SSD512. The experiments show that ESSD achieves high detection accuracy while still meeting real-time requirements.

Key words: Object detection, Information fusion, Receptive field, Multi-scale, SSD


  • CLC number: TP391
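
The abstract refers to enlarging the model's receptive field but does not say how the amplification module is built. As a rough illustration only, the sketch below computes the theoretical receptive field of a stack of convolution layers and shows how dilation (one common way to enlarge it, assumed here and not confirmed by the abstract) changes the result. The function name `receptive_field` and the two layer stacks are hypothetical and are not taken from the paper.

```python
from typing import List, Tuple

# Each layer is described as (kernel_size, stride, dilation).
# The stacks defined below are illustrative assumptions, not the ESSD design.
Layer = Tuple[int, int, int]


def receptive_field(layers: List[Layer]) -> int:
    """Theoretical receptive field (in input pixels) of stacked conv layers."""
    rf = 1    # a single input pixel sees only itself
    jump = 1  # spacing, in input pixels, between adjacent outputs of the current layer
    for kernel, stride, dilation in layers:
        # A dilated kernel covers dilation * (kernel - 1) + 1 positions.
        effective_kernel = dilation * (kernel - 1) + 1
        rf += (effective_kernel - 1) * jump
        jump *= stride
    return rf


# Two 3x3 convolutions, stride 1, no dilation.
plain = [(3, 1, 1), (3, 1, 1)]
# Same stack, but the second 3x3 convolution uses dilation 3.
dilated = [(3, 1, 1), (3, 1, 3)]

print(receptive_field(plain))    # 5
print(receptive_field(dilated))  # 9
```

With the same number of weights, the dilated stack covers a 9-pixel span instead of 5, which is the general effect a receptive field enlargement module aims for.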
